
Title of the Invention 



NUCLEOTIDE AND AMINO ACID SEQUENCES OF THE ENVELOPE 1
AND CORE GENES OF HEPATITIS C VIRUS



The present application is a continuation-in-part 
of pending U.S. Application Serial No. 08/086,428, filed on
June 29, 1993. 

Field Of Invention

The present invention is in the field of
hepatitis virology. The invention relates to the complete
nucleotide and deduced amino acid sequences of the envelope
1 (E1) and core genes of hepatitis C virus (HCV) isolates
from around the world and the grouping of these isolates
into fourteen distinct HCV genotypes. More specifically,
this invention relates to oligonucleotides, peptides and
recombinant proteins derived from the envelope 1 and core
gene sequences of these isolates of hepatitis C virus and
to diagnostic methods and vaccines which employ these
reagents.

Background Of Invention
Hepatitis C, originally called non-A, non-B
hepatitis, was first described in 1975 as a disease
serologically distinct from hepatitis A and hepatitis B
(Feinstone, S.M. et al. (1975) N. Engl. J. Med. 292:767-
770). Although hepatitis C was (and is) the leading type
of transfusion-associated hepatitis as well as an important
part of community-acquired hepatitis, little progress was
made in understanding the disease until the recent
identification of hepatitis C virus (HCV) as the causative
agent of hepatitis C via the cloning and sequencing of the
HCV genome (Choo, Q.-L. et al. (1989) Science 244:359-362).
The sequence information generated by this study resulted
in the characterization of HCV as a small, enveloped,
positive-stranded RNA virus and led to the demonstration
that HCV is a major cause of both acute and chronic
hepatitis worldwide (Weiner, A.J. et al. (1990) Lancet
335:1-3). These observations, combined with studies
showing that over 50% of acute cases of hepatitis C
progress to chronicity with 20% of these resulting in
cirrhosis and an undetermined proportion progressing to
liver cancer, have led to tremendous efforts by
investigators within the hepatitis C field to develop
diagnostic assays and vaccines which can detect and prevent
hepatitis C infection.

The cloning and sequencing of the HCV genome by
Choo et al. (1989) has permitted the development of
serologic tests which can detect HCV or antibody to HCV
(Kuo, G. et al. (1989) Science 244:362-364). In addition,
the work of Choo et al. has also allowed the development of
methods for detecting HCV infection via amplification of
HCV RNA sequences by reverse transcription and cDNA
polymerase chain reaction (RT-PCR) using primers derived
from the HCV genomic sequence (Weiner, A.J. et al.).
However, although the development of these diagnostic
methods has resulted in improved diagnosis of HCV
infection, only approximately 60% of cases of hepatitis C
are associated with a factor identified as contributing to
transmission of HCV (Alter, M.J. et al. (1989) JAMA
262:1201-1205). This observation suggests that effective
control of hepatitis C transmission is likely to occur only
via universal pediatric vaccination as has been initiated
recently for hepatitis B virus. Unfortunately, attempts to
date to protect chimpanzees from hepatitis C infection via
administration of recombinant vaccines have had only
limited success. Moreover, the apparent genetic
heterogeneity of HCV, as indicated by the recent assignment
of all available HCV isolates to one of four genotypes,
I-IV (Okamoto, H. et al. (1992) J. Gen. Virol. 73:673-679),
presents additional hurdles which must be overcome in order
to develop accurate and effective diagnostic assays and
vaccines.

For example, one possible obstacle to the
development of effective hepatitis C vaccines would arise
if the observed genetic heterogeneity of HCV reflects
serologic heterogeneity. In such a case, the most
genetically diverse strains of HCV may then represent
different serotypes of HCV with the result being that
infection with one strain may not protect against infection
with another. Indeed, the inability of one strain to
protect against infection with another strain was recently
noted by both Farci et al. (Farci, P. et al. (1992) Science
258:135-140) and Prince et al. (Prince, A.M. et al. (1992)
J. Infect. Dis. 165:438-443), each of whom presented
evidence that while infection with one strain of HCV does
modify the degree of the hepatitis C associated with the
reinfection, it does not protect against reinfection with a
closely related strain. The genetic heterogeneity among
different HCV strains also increases the difficulty
encountered in developing RT-PCR assays to detect HCV
infection since such heterogeneity often results in
false-negative results because of primer and template mismatch.
In addition, currently used serologic tests for detection
of HCV or for detection of antibody to HCV are not
sufficiently well developed to detect all of the HCV
genotypes which might exist in a given blood sample.
Finally, in terms of choosing the proper treatment modality
to combat hepatitis infection, the inability of presently
available serologic assays to distinguish among the various
genotypes of HCV represents a significant shortcoming in
that recent reports suggest that an HCV-infected patient's
response to therapy might be related to the genotype of the
infectious virus (Yoshioka, K. et al. (1992) Hepatology
16:293-299; Kanai, K. et al. (1992) Lancet 339:1543; Lau,
J.Y.N. et al. (1992) Hepatology 16:209A). Indeed, the data
presented in the above studies suggest that the closely
related genotypes I and II are less responsive to
interferon therapy than are the closely related genotypes
III and IV. Moreover, preliminary data by Pozzato et al.
(Pozzato, G. et al. (1991) Lancet 338:509) suggest that
different genotypes may be associated with different types
or degrees of clinical disease. Taken together, these
studies suggest that before effective vaccines against HCV
infection can be developed, and indeed, before more
accurate and effective methods for diagnosis and treatment
of HCV infection can be produced, one must obtain a greater
knowledge about the genetic and serologic diversity of HCV
isolates.

In a recent attempt to gain an understanding of
the extent of genetic heterogeneity among HCV strains, Bukh
et al. carried out a detailed analysis of HCV isolates via
the use of PCR technology to amplify different regions of
the HCV genome (Bukh, J. et al. (1992a) Proc. Natl. Acad.
Sci. 89:187-191). Following PCR amplification, the 5'-
noncoding (5' NC) portion of the genomes of various HCV
isolates was sequenced and it was found that primer pairs
designed from conserved regions of the 5' NC region of the
HCV genome were more sensitive for detecting the presence
of HCV than were primer pairs representing other portions
of the genome (Bukh, J. et al. (1992b) Proc. Natl. Acad.
Sci. U.S.A. 89:4942-4946). In addition, the authors noted
that although many of the HCV isolates examined could be
classified into the four genotypes described by Okamoto et
al. (1992), other previously undescribed genotypes emerged
based on genetic heterogeneity observed in the 5' NC region
of the various isolates. One of the most prominent of
these newly noted genotypes comprised a group of related
viruses that contained the most genetically divergent 5' NC
regions of those studied. This group of viruses,
tentatively classified as a fifth genotype, is very
similar to strains recently described by others (Cha, T.-A.
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:7144-7148;
Chan, S-W. et al. (1992) J. Gen. Virol. 73:1131-1141 and
Lee, C-H. et al. (1992) J. Clin. Microbiol. 30:1602-1604).
In addition, at least four more putative genotypes were
identified thereby providing evidence that the genetic
heterogeneity of HCV was more extensive than previously
appreciated.

However, while the studies of Bukh et al. (1992a
and b) provided new and useful information on the genetic
heterogeneity of HCV, it is widely appreciated by those
skilled in the art that the three structural genes of HCV,
core (C), envelope 1 (E1) and envelope 2/nonstructural 1
(E2/NS1), are the most important for the development of
serologic diagnostics and vaccines since it is the products
of these genes that constitute the hepatitis C virion.
Thus, a determination of the nucleotide sequence of one or
all of the structural genes of a variety of HCV isolates
would be useful in designing reagents for use in diagnostic
assays and vaccines since a demonstration of genetic
heterogeneity in a structural gene(s) of HCV isolates might
suggest that some of the HCV genotypes represent distinct
serotypes of HCV based upon the previously observed
relationship between genetic heterogeneity and serologic
heterogeneity among another group of single-stranded,
positive-sense RNA viruses, the picornaviruses (Rueckert,
R.R. "Picornaviridae and their replication", in Fields,
B.N. et al., eds. Virology, New York: Raven Press, Ltd.
(1990) 507-548).



The present invention relates to cDNAs encoding
the complete nucleotide sequence of either the envelope 1
(E1) gene or the core (C) gene of an isolate of human
hepatitis C virus (HCV).

The present invention also relates to the nucleic
acid and deduced amino acid sequences of these E1 and core
cDNAs.

It is an object of this invention to provide
synthetic nucleic acid sequences capable of directing
production of recombinant E1 and core proteins, as well as
equivalent natural nucleic acid sequences. Such natural
nucleic acid sequences may be isolated from a cDNA or
genomic library from which the gene capable of directing
synthesis of the E1 or core proteins may be identified and
isolated. For purposes of this application, nucleic acid
sequence refers to RNA, DNA, cDNA or any synthetic variant
thereof which encodes peptides.

The invention also relates to the method of
preparing recombinant E1 and core proteins derived from E1
and core cDNA sequences respectively by cloning the nucleic
acid encoding either the recombinant E1 or core protein and
inserting the cDNA into an expression vector and expressing
the recombinant protein in a host cell.

The invention also relates to isolated and
substantially purified recombinant E1 and core proteins and
analogs thereof encoded by E1 and core cDNAs respectively.

The invention further relates to the use of
recombinant E1 and core proteins, either alone, or in
combination with each other, as diagnostic agents and as
vaccines.

The present invention also relates to the
recombinant production of the core protein of the present
invention to contain a second protein on its surface and
therefore serve as a carrier in a multivalent vaccine
preparation. Further, the present invention relates to the
use of the self-aggregating core or envelope proteins as a
drug delivery system for anti-virals.

The invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived from
E1 or core cDNAs, or from both E1 and core cDNAs, to
inhibit expression of hepatitis C E1 and/or core genes.

The invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences of the E1 and core cDNAs. These
multiple sequence alignments produce consensus sequences
which serve to highlight regions of homology and
non-homology between sequences found within the same genotype
or in different genotypes and hence, these alignments can
be used by one skilled in the art to design peptides and
oligonucleotides useful as reagents in diagnostic assays
and vaccines.

The invention therefore also relates to purified
and isolated peptides and analogs thereof derived from E1
and core cDNA sequences.

The invention further relates to the use of these
peptides as diagnostic agents and vaccines.

The present invention also encompasses methods of
detecting antibodies specific for hepatitis C virus in
biological samples. The methods of detecting HCV or
antibodies to HCV disclosed in the present invention are
useful for diagnosis of infection and disease caused by HCV
and for monitoring the progression of such disease. Such
methods are also useful for monitoring the efficacy of
therapeutic agents during the course of treatment of HCV
infection and disease in a mammal.

The invention also provides a kit for the
detection of antibodies specific for HCV in a biological
sample where said kit contains at least one purified and
isolated peptide derived from the E1 or core cDNA
sequences. In addition, the invention provides for a kit
containing at least one purified and isolated peptide
derived from the E1 cDNA sequences and at least one
purified and isolated peptide derived from the core cDNA
sequences.

The invention further provides isolated and
purified genotype-specific oligonucleotides and analogs
thereof derived from E1 and core cDNA sequences.

The invention also relates to methods for
detecting the presence of hepatitis C virus in a mammal,
said methods comprising analyzing the RNA of a mammal for
the presence of hepatitis C virus. The invention further
relates to methods for determining the genotype of
hepatitis C virus present in a mammal. This method is
useful in determining the proper course of treatment for an
HCV-infected patient.

The invention also provides a diagnostic kit for
the detection of hepatitis C virus in a biological sample.
The kit comprises purified and isolated nucleic acid
sequences useful as primers for reverse-transcription
polymerase chain reaction (RT-PCR) analysis of RNA for the
presence of hepatitis C virus genomic RNA.

The invention further provides a diagnostic kit
for the determination of the genotype of a hepatitis C
virus present in a mammal. The kit comprises purified and
isolated nucleic acid sequences useful as primers for
RT-PCR analysis of RNA for the presence of HCV in a biological
sample and purified and isolated nucleic acid sequences
useful as hybridization probes in determining the genotype
of the HCV isolate detected in PCR analysis.

This invention also relates to pharmaceutical
compositions useful in prevention or treatment of hepatitis
C in a mammal.






Description of Figures
Figures 1A-H show computer generated sequence
alignments of the nucleotide sequences of 51 HCV E1 cDNAs.
The single letter abbreviations used for the nucleotides
shown in Figures 1A-H are those standardly used in the art.
Figure 1A shows the alignment of SEQ ID NOs:1-8 to produce
a consensus sequence for genotype I/1a. Figure 1B shows
the alignment of SEQ ID NOs:9-25 to produce a consensus
sequence for genotype II/1b. Figure 1C shows the alignment
of SEQ ID NOs:26-29 to produce a consensus sequence for
genotype III/2a. Figure 1D shows the alignment of SEQ ID
NOs:30-33 to produce a consensus sequence for genotype
IV/2b. Figure 1E shows the alignment of SEQ ID NOs:35-39
to produce a consensus sequence for genotype V/3a. Figure
1F shows the computer alignment of SEQ ID NOs:42-43 to
produce a "consensus" sequence for genotype 4c where the
"consensus" sequence given is that of SEQ ID NO:42. Figure
1G shows the alignment of SEQ ID NOs:45-50 to produce a
consensus sequence for genotype 5a. The nucleotides shown
in capital letters in the consensus sequences of Figures
1A-G are those conserved within a genotype while
nucleotides shown in lower case letters in the consensus
sequences are those variable within a genotype. In
addition, in Figures 1A-E and 1G, when the lower case
letter is shown in a consensus sequence, the lower case
letter represents the nucleotide found most frequently in
the sequences aligned to produce the consensus sequence.
In Figure 1F, the lower case letters shown in the consensus
sequence are nucleotides in SEQ ID NO:42 which differ from
nucleotides found in the same positions in SEQ ID NO:43.
Finally, a hyphen at a nucleotide position in the consensus
sequences in Figures 1A-G indicates that two nucleotides
were found in equal numbers at that position in the aligned
sequences. In the aligned sequences, nucleotides are shown
in lower case letters if they differed from the nucleotides
of both adjacent isolates. Figure 1H shows the alignment
of the consensus sequences of Figures 1A-G with SEQ ID
NO:34 (genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ ID
NO:41 (genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ ID
NO:51 (genotype 6a) to produce a consensus sequence for all
twelve genotypes. This consensus sequence is shown as the
bottom line of Figure 1H where the nucleotides shown in
capital letters are conserved among all genotypes and a
blank space indicates that the nucleotide at that position
is not conserved among all genotypes.
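
The within-genotype consensus convention described for Figures 1A-E and 1G (capital letter for a conserved position, lower case for the most frequent residue at a variable position, hyphen for a tie) can be illustrated with a short sketch. This is not the alignment software used to prepare the figures; the function name `consensus`, the treatment of ties among more than two residues, and the toy input are assumptions made for illustration only, and the special two-sequence rule of Figure 1F is not covered.

```python
from collections import Counter


def consensus(aligned):
    """Return a consensus string for equal-length, pre-aligned sequences.

    Convention sketched from the figure description:
      capital letter -> residue identical in every sequence
      lower case     -> most frequent residue at a variable position
      hyphen         -> the most frequent residues are tied
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must already be aligned to one length")

    out = []
    for column in zip(*(s.upper() for s in aligned)):
        ranked = Counter(column).most_common()
        top_residue, top_count = ranked[0]
        if len(ranked) == 1:
            out.append(top_residue)          # conserved position
        elif ranked[1][1] == top_count:
            out.append("-")                  # tie (assumed to cover any tie)
        else:
            out.append(top_residue.lower())  # variable, most frequent shown
    return "".join(out)


# Toy input, not taken from the sequence listing.
toy = ["ACGT", "ACGA", "ACTA", "ATTA"]
print(consensus(toy))  # prints Ac-a
```

The same function applies unchanged to aligned amino acid strings, since it only counts characters per column.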

Figures 2A-H show computer alignments of the
deduced amino acid sequences of 51 HCV E1 cDNAs. The
single letter abbreviations used for the amino acids shown
in Figures 2A-H follow the conventional amino acid
shorthand for the twenty naturally occurring amino acids.
Figure 2A shows the alignment of SEQ ID NOs:52-59 to
produce a consensus sequence for genotype I/1a. Figure 2B
shows the alignment of SEQ ID NOs:60-76 to produce a
consensus sequence for genotype II/1b. Figure 2C shows the
alignment of SEQ ID NOs:77-80 to produce a consensus
sequence for genotype III/2a. Figure 2D shows the
alignment of SEQ ID NOs:81-84 to produce a consensus
sequence for genotype IV/2b. Figure 2E shows the alignment
of SEQ ID NOs:86-90 to produce a consensus sequence for
genotype V/3a. Figure 2F shows the computer alignment of
SEQ ID NOs:93-94 to produce a consensus sequence for
genotype 4c. Figure 2G shows the alignment of SEQ ID
NOs:96-101 to produce a consensus sequence for genotype 5a.
The amino acids shown in capital letters in the consensus
sequences of Figures 2A-G are those conserved within a
genotype while amino acids shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, in Figures 2A-E and 2G when the
lower case letter is shown in a consensus sequence, the
letter represents the amino acid found most frequently in
the sequences aligned to produce the consensus sequence.
In Figure 2F, the lower case letters shown in the consensus
sequence are amino acids in SEQ ID NO:93 which differ from
amino acids found in the same positions in SEQ ID NO:94.
Finally, a hyphen at an amino acid position in the
consensus sequences of Figures 2A-G indicates that two
amino acids were found in equal numbers at that position in
the aligned sequences. In the aligned sequences, amino
acids are shown in lower case letters if they differed from
the amino acids of both adjacent isolates. Figure 2H shows
the alignment of the consensus sequences of Figures 2A-G
with SEQ ID NO:85 (genotype 2c), SEQ ID NO:91 (genotype
4a), SEQ ID NO:92 (genotype 4b), SEQ ID NO:95 (genotype 4d)
and SEQ ID NO:102 (genotype 6a) to produce a consensus
sequence for all twelve genotypes. This consensus sequence
is shown as the bottom line of Figure 2H where the amino
acids shown in capital letters are conserved among all
genotypes and a blank space indicates that the amino acid
at that position is not conserved among all genotypes.
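
The rule that residues in the aligned sequences are shown in lower case when they differ from both adjacent isolates can be sketched as follows. This is an illustrative reading only: the function name `mark_differences` is hypothetical, and the handling of the first and last rows (which have a single neighbour) is an assumption, since the figure description does not state it.

```python
def mark_differences(aligned):
    """Lower-case each residue that differs from every adjacent row.

    Assumes the rows are already aligned to the same length. Rows at the
    top and bottom have only one neighbour and are compared with that
    neighbour alone (an assumption, not stated in the description).
    """
    rows = [s.upper() for s in aligned]
    marked = []
    for r, row in enumerate(rows):
        neighbours = [rows[n] for n in (r - 1, r + 1) if 0 <= n < len(rows)]
        out = []
        for pos, residue in enumerate(row):
            differs = bool(neighbours) and all(
                nb[pos] != residue for nb in neighbours
            )
            out.append(residue.lower() if differs else residue)
        marked.append("".join(out))
    return marked


# Toy peptide rows, not taken from the sequence listing.
for line in mark_differences(["YQVRN", "YEVRN", "YQVRS"]):
    print(line)
```

Because the comparison is made only against neighbouring rows, the marking depends on the order in which isolates are listed in the alignment.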

Figure 3 shows a multiple sequence alignment of the
deduced amino acid sequence of the E1 gene of 51 HCV
isolates collected worldwide. The consensus sequence of
the E1 protein is shown in boldface (top). In the
consensus sequence cysteine residues are highlighted with
stars, potential N-linked glycosylation sites are
underlined, and invariant amino acids are capitalized,
whereas variable amino acids are shown in lower case
letters. In the alignment, amino acids are shown in lower
case letters if they differed from the amino acid of both
adjacent isolates. Amino acid residues shown in bold print
in the alignment represent residues which at that position
in the amino acid sequence are genotype-specific. Amino
acids that were invariant among all HCV isolates are shown
as hyphens (-) in the alignment. Amino acid positions
correspond to those of the HCV prototype sequence (HCV-1,
Choo, Q.-L. et al. (1991) Proc. Natl. Acad. Sci. USA
88:2451-2455) with the first amino acid of the E1 protein at
position 192. The grouping of isolates into 12 genotypes
(I/1a, II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a
and 6a) is indicated.

Figure 4 shows a dendrogram of the genetic
relatedness of the twelve genotypes of HCV based on the
percent amino acid identity of the E1 gene of the HCV
genome. The twelve genotypes shown are designated as I/1a,
II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a and 6a.
The shaded bars represent a range showing the maximum and
minimum homology between the amino acid sequence of any one
isolate of the genotype indicated and the amino acid
sequence of any other isolate.
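
The range shown by each shaded bar is a minimum and maximum of pairwise percent identities. A minimal sketch of that calculation is given below, assuming the sequences are already aligned and of equal length; the function names and the toy peptides are hypothetical, and how gaps were scored for the actual figure is not stated in the description.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of aligned positions at which two sequences are identical."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_range(sequences):
    """Return (minimum, maximum) percent identity over all pairs."""
    values = [percent_identity(a, b) for a, b in combinations(sequences, 2)]
    if not values:
        raise ValueError("need at least two sequences")
    return min(values), max(values)


# Toy peptides, not taken from the sequence listing.
toy = ["YQVRNSSGLY", "YQVRNSTGLY", "YEVRNVSGIY"]
print(identity_range(toy))  # prints (60.0, 90.0)
```

To compare one genotype against another rather than all isolates against each other, the same `percent_identity` function could be applied to pairs drawn from two separate lists.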
Figure 5 shows the distribution of the complete
E1 gene sequence of 74 HCV isolates into the twelve HCV
genotypes in the 12 countries studied. For 51 of these HCV
isolates, including 8 isolates of genotype I/1a, 17
isolates of genotype II/1b and 26 isolates comprising the
additional 10 genotypes, the complete E1 gene sequence was
determined. In the remaining 23 isolates, all of genotypes
I/1a and II/1b, the genotype assignment was based on only a
partial E1 gene sequence. The partially sequenced isolates
did not represent additional genotypes in any of the 12
countries. The number of isolates of a particular genotype
is given in each of the 12 countries studied. For ease of
viewing, those genotypes designated by two terms (e.g.,
I/1a) are indicated by the latter term (e.g., 1a). The
designations used for each country are: Denmark (DK);
Dominican Republic (DR); Germany (D); Hong Kong (HK); India
(IND); Sardinia, Italy (S); Peru (P); South Africa (SA);
Sweden (SW); Taiwan (T); United States (US); and Zaire (Z).
National borders depicted in this figure represent those
existing at the time of sampling.

Figures 6A-K show computer generated sequence
alignments of the nucleotide sequences of 52 HCV core
cDNAs. Single letter abbreviations used for the
nucleotides shown in Figures 6A-J are those standardly used
in the art. Figure 6A shows the alignment of SEQ ID NOs:
103-108 to produce a consensus sequence for genotype I/1a.
Figure 6B shows the alignment of SEQ ID NOs:109-124 to
produce a consensus sequence for genotype II/1b. Figure 6C
shows the alignments of the sequences comprising minor
genotypes I/1a (SEQ ID NOs:103-108) and II/1b (SEQ ID NOs:
109-124) to produce a consensus sequence for the major
genotype, genotype 1. Figure 6D shows the alignment of SEQ
ID NOs:125-128 to produce a consensus sequence for
genotype III/2a. Figure 6E shows the alignment of SEQ ID
NOs:129-133 to produce a consensus sequence for genotype
IV/2b. Figure 6F shows the alignment of the sequences of
minor genotypes III/2a (SEQ ID NOs:125-128), IV/2b (SEQ ID
NOs:129-133) and 2c (SEQ ID NO:134) to produce a
consensus sequence for the major genotype, genotype 2.
Figure 6G shows the alignment of SEQ ID NOs:135-138 to
produce a consensus sequence for genotype V/3a. Figure 6H
shows the computer alignment of the sequences of minor
genotypes 4a-4f (SEQ ID NOs:139-145) to produce a
consensus sequence for the major genotype, genotype 4.
Figure 6I shows the alignment of SEQ ID NOs:146-153 to
produce a consensus sequence for genotype 5a. The
nucleotides shown in capital letters in the consensus
sequences in Figures 6A-I are those conserved within the
genotype while nucleotides shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, when the lower case letter is shown
in the consensus sequence, the lower case letter represents
the nucleotide found most frequently in the sequences
aligned to produce that consensus sequence. Moreover, a
hyphen at a nucleotide position in the consensus sequences
in Figures 6A-6I indicates that two nucleotides were found
in equal numbers at that position in the sequences aligned
to produce the consensus sequence. Finally, nucleotides
are shown in lower case letters in the sequences aligned to
produce each consensus sequence shown in Figures 6A-6I, if
they differed from the nucleotides of both adjacent
isolates. Figure 6J shows the alignment of the consensus
sequences of major genotypes 1 (Figure 6C), 2 (Figure 6F),
3 (Figure 6G), 4 (Figure 6H), 5 (Figure 6I) and 6 (SEQ ID
NO:154) to produce a consensus sequence for all genotypes
and Figure 6K shows the alignment of consensus sequences of
Figures 6A, 6B, 6D, 6E, 6G and 6I with SEQ ID NO:134
(genotype 2c), SEQ ID NO:139 (genotype 4a), SEQ ID NO:141
(genotype 4b), SEQ ID NO:143 (genotype 4c), SEQ ID NO:145
(genotype 4d), SEQ ID NO:142 (genotype 4e), SEQ ID NO:140
(genotype 4f) and SEQ ID NO:154 (genotype 6a) to produce a
consensus sequence for all fourteen genotypes. The
nucleotides shown in capital letters in the consensus
sequences of Figures 6J and 6K are conserved among all
genotypes and the nucleotides shown in lower case letters
represent the nucleotides found most frequently in the
sequences aligned to produce this consensus sequence. In
addition, the presence of a hyphen at a nucleotide position
in all fourteen sequences aligned in Figure 6K indicates
that the nucleotide found at that position in the aligned
sequences is the same as the nucleotide shown at the
corresponding position in the consensus sequence of Figure
6K.

Figures 7A-7J show computer alignments of the
deduced amino acid sequences of the 52 HCV core cDNAs. The
single letter abbreviations used for the amino acids shown
in Figures 7A-7J follow the conventional amino acid
shorthand for the twenty naturally occurring amino acids. Figure
7A shows the alignment of SEQ ID NOs:155-160 to produce a
consensus sequence for genotype I/1a. Figure 7B shows the
alignment of SEQ ID NOs:161-176 to produce a consensus
sequence for genotype II/1b. Figure 7C shows the alignment
of the sequences comprising minor genotypes I/1a (SEQ ID
NOs:155-160) and II/1b (SEQ ID NOs:161-176) to produce a
consensus sequence for the major genotype, genotype 1.
Figure 7D shows the alignment of SEQ ID NOs:177-180 to
produce a consensus sequence for genotype III/2a. Figure
7E shows the alignment of SEQ ID NOs:181-185 to produce a
consensus sequence for genotype IV/2b. Figure 7F shows the
alignment of the sequences of minor genotypes III/2a (SEQ
ID NOs:177-180), IV/2b (SEQ ID NOs:181-185) and 2c (SEQ
ID NO:186) to produce a consensus sequence for the major
genotype, genotype 2. Figure 7G shows the alignment of SEQ
ID NOs:187-190 to produce a consensus sequence for
genotype V/3a. Figure 7H shows the computer alignment of
the sequences of minor genotypes 4a-4f (SEQ ID NOs:191-
197) to produce a consensus sequence for the major
genotype, genotype 4. Figure 7I shows the alignment of SEQ
ID NOs:198-205 to produce a consensus sequence for
genotype 5a. The amino acids shown in capital letters in
the consensus sequences of Figures 7A-7I are those
conserved within the genotype while amino acids shown in
lower case letters in the consensus sequences are those
variable within the genotype. In addition, when a lower
case letter is found in the consensus sequences shown in
Figures 7A-7I, the letter represents the amino acid found
most frequently in the sequences aligned to produce that
consensus sequence. Moreover, a hyphen in an amino acid
position in the consensus sequences of Figures 7A-7I
indicates that two amino acids were found in equal numbers
at that position in the sequences aligned to produce that
consensus sequence. Finally, amino acids are shown in
lower case letters in the sequences aligned to produce the
consensus sequences shown in Figures 7A-7I if these amino
acids differed from the amino acids of both adjacent
isolates. Figure 7J shows the alignment of the consensus
sequences of major genotypes 1 (Figure 7C), 2 (Figure 7F),
3 (Figure 7G), 4 (Figure 7H), 5 (Figure 7I) and 6 (SEQ ID
NO:154) to produce a consensus sequence for all genotypes
and Figure 7K shows the alignment of the consensus
sequences of Figures 7A, 7B, 7D, 7E, 7G and 7I with SEQ ID
NO:186 (genotype 2c), SEQ ID NO:191 (genotype 4a), SEQ ID
NO:193 (genotype 4b), SEQ ID NO:195 (genotype 4c), SEQ ID
NO:197 (genotype 4d), SEQ ID NO:194 (genotype 4e), SEQ ID
NO:192 (genotype 4f) and SEQ ID NO:206 (genotype 6a) to
produce a consensus sequence for all fourteen genotypes.
The amino acids shown in capital letters in the consensus
sequences shown in Figures 7J and 7K are conserved among
all genotypes while the amino acids shown in lower case
letters represent amino acids found most frequently in the
sequences aligned to produce this consensus sequence. In
addition, the presence of a hyphen at an amino acid
position in all fourteen sequences aligned in Figure 7K
indicates that the amino acid found at that position in the
aligned sequences is the same as the amino acid shown at
the corresponding position in the consensus sequence of
Figure 7K.

Figure 8 shows phylogenetic trees illustrating
the calculated evolutionary relationships of the different
HCV isolates based upon the C gene sequence of 52 HCV
isolates and the E1 gene sequence of 51 HCV isolates,
respectively. The phylogenetic trees were constructed by
the unweighted pair-group method with arithmetic mean (Nei,
M. (1987) Molecular Evolutionary Genetics (Columbia
University Press, New York, N.Y.), pp. 287-326) using the
computer software package "GeneWorks" from
IntelliGenetics. The lengths of the horizontal lines
connecting the sequences, given in absolute values from 0
to 1, are proportional to the estimated genetic distances
between the sequences. Genotype designations of HCV
isolates are indicated. In 45 HCV isolates, both the C and
the E1 gene sequences were determined.
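
For readers unfamiliar with the unweighted pair-group method with arithmetic mean (UPGMA), the clustering step can be sketched in a few lines. This is a generic textbook version written for illustration, not the commercial software package named above; the function name `upgma`, the nested-tuple tree format and the toy distances are all assumptions. Each merged node is placed at half the distance between the two clusters it joins.

```python
from itertools import combinations


def upgma(labels, distance):
    """Cluster `labels` by UPGMA.

    `distance(a, b)` must return the distance between two labels.
    Returns nested tuples (left_subtree, right_subtree, node_height);
    a leaf is just its label.
    """
    if not labels:
        raise ValueError("need at least one label")
    trees = {i: label for i, label in enumerate(labels)}
    sizes = {i: 1 for i in trees}
    dist = {frozenset((i, j)): distance(labels[i], labels[j])
            for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)

    while len(trees) > 1:
        closest = min(dist, key=dist.get)      # pair of clusters to merge
        i, j = sorted(closest)
        height = dist[closest] / 2.0
        merged_size = sizes[i] + sizes[j]
        # Distance to every remaining cluster: size-weighted arithmetic mean.
        new_dist = {}
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist[frozenset((i, k))]
            d_jk = dist[frozenset((j, k))]
            new_dist[k] = (sizes[i] * d_ik + sizes[j] * d_jk) / merged_size
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        for k, v in new_dist.items():
            dist[frozenset((next_id, k))] = v
        trees[next_id] = (trees.pop(i), trees.pop(j), height)
        sizes[next_id] = merged_size
        del sizes[i], sizes[j]
        next_id += 1
    return next(iter(trees.values()))


# Toy distances between four hypothetical isolates A-D.
toy = {frozenset("AB"): 0.1, frozenset("CD"): 0.2}
tree = upgma(list("ABCD"), lambda a, b: toy.get(frozenset((a, b)), 0.5))
print(tree)
```

In practice the `distance` callable would be derived from the aligned C or E1 sequences, for example as one minus the fractional identity of each pair.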

Detailed Description Of Invention
The present invention relates to cDNAs encoding
the complete nucleotide sequence of the envelope 1 (E1) and
core genes of isolates of human hepatitis C virus (HCV).
The E1 cDNAs of the present invention were obtained as
follows. Viral RNA was extracted from serum collected from
humans infected with hepatitis C virus and the viral RNA
was then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
the HCV strain H-77 (Ogata, N. et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:3392-3396). The amplified cDNA was
then isolated by gel electrophoresis and sequenced.

The present invention further relates to the 
nucleotide sequences of the cDNAs encoding the E1 gene of
51 HCV isolates. These nucleotide sequences are shown in
the sequence listing as SEQ ID NO:1 through SEQ ID NO:51.

The abbreviations used for the nucleotides are
those standardly used in the art.

The deduced amino acid sequence of each of SEQ ID
NO:1 through SEQ ID NO:51 is presented in the sequence
listing as SEQ ID NO:52 through SEQ ID NO:102, where the
amino acid sequence in SEQ ID NO:52 is deduced from the
nucleotide sequence shown in SEQ ID NO:1, the amino acid
sequence shown in SEQ ID NO:53 is deduced from the
nucleotide sequence shown in SEQ ID NO:2 and so on. The
deduced amino acid sequence of each of SEQ ID NOs:52-102
starts at nucleotide 1 of the corresponding nucleic acid
sequence shown in SEQ ID NOs:1-51 and extends 575
nucleotides to a total length of 576 nucleotides.

The three letter abbreviations used in SEQ ID
NOs:52-102 follow the conventional amino acid shorthand for
the twenty naturally occurring amino acids. 

The present invention also relates to the 
nucleotide sequences of the cDNAs encoding the core gene of 
52 HCV isolates. These nucleotide sequences are shown in 
the sequence listing as SEQ ID NO:103 through SEQ ID
NO:154.

The core cDNAs of the present invention were
obtained as follows. Viral RNA was extracted from serum
and reverse transcribed as described above for cloning of
the E1 cDNAs. The core cDNAs of the present invention were
then amplified by polymerase chain reaction using primers
deduced from previously determined sequences that flank the
core gene (Bukh et al. (1992) Proc. Natl. Acad. Sci.
U.S.A. 89:4942-4946; Bukh et al. (1993) Proc. Natl. Acad.
Sci. U.S.A. 90:8234-8238).

The deduced amino acid sequence of each of SEQ ID
NO:103 through SEQ ID NO:154 is presented in the sequence
listing as SEQ ID NO:155 through SEQ ID NO:206, where the
amino acid sequence in SEQ ID NO:155 is deduced from the
nucleotide sequence shown in SEQ ID NO:103, the amino acid
sequence shown in SEQ ID NO:156 is deduced from the
nucleotide sequence shown in SEQ ID NO:104 and so on. The
deduced amino acid sequence of each of SEQ ID NOs:155-206
starts at nucleotide 1 of the corresponding nucleotide
sequence shown in SEQ ID NOs:103-154 and extends 572
nucleotides to a total length of 573 nucleotides.
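
The pairing of nucleotide and amino acid sequence identifiers described above follows a fixed offset, which may be sketched as follows. The function names protein_seq_id and codon_count are hypothetical and are not part of the sequence listing; the ranges and lengths are taken from the text above, and the codon count assumes that the full stated length is read in frame from nucleotide 1.

```python
# Ranges of nucleotide SEQ ID NOs as stated in the specification.
E1_NUCLEOTIDE_IDS = range(1, 52)        # SEQ ID NOs:1-51
CORE_NUCLEOTIDE_IDS = range(103, 155)   # SEQ ID NOs:103-154

E1_LENGTH_NT = 576     # stated total length of each E1 sequence
CORE_LENGTH_NT = 573   # stated total length of each core sequence


def protein_seq_id(nucleotide_seq_id: int) -> int:
    """Return the SEQ ID NO of the amino acid sequence deduced from a cDNA."""
    if nucleotide_seq_id in E1_NUCLEOTIDE_IDS:
        # SEQ ID NO:1 -> 52, SEQ ID NO:2 -> 53, ..., SEQ ID NO:51 -> 102
        return nucleotide_seq_id + 51
    if nucleotide_seq_id in CORE_NUCLEOTIDE_IDS:
        # SEQ ID NO:103 -> 155, SEQ ID NO:104 -> 156, ..., SEQ ID NO:154 -> 206
        return nucleotide_seq_id + 52
    raise ValueError(f"SEQ ID NO:{nucleotide_seq_id} is not an E1 or core cDNA")


def codon_count(length_nt: int) -> int:
    """Number of codons in a sequence read in frame from nucleotide 1."""
    if length_nt % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return length_nt // 3
```

Under this reading, a 576-nucleotide E1 sequence corresponds to 192 codons and a 573-nucleotide core sequence to 191 codons.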

Preferably, the E1 and core proteins and peptides
of the present invention are substantially homologous to,
and most preferably biologically equivalent to, native HCV
E1 and core proteins and peptides. By "biologically
equivalent" as used throughout the specification and
claims, it is meant that the compositions are
immunogenically equivalent to the native E1 and core
proteins and peptides. The E1 and core proteins and
peptides of the present invention may also stimulate the
production of protective antibodies upon injection into a
mammal that would serve to protect the mammal upon
challenge with HCV. By "substantially homologous" as used
throughout the ensuing specification and claims to describe
E1 and core proteins and peptides, it is meant a degree of
homology in the amino acid sequence of the E1 and core
proteins and peptides to the native E1 and core proteins
and peptides respectively. Preferably the degree of 
homology is in excess of 90%, preferably in excess of 95%,
with a particularly preferred group of proteins being in
excess of 99% homologous with the native E1 or core proteins
and peptides. 
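
The specification does not state how the degree of homology is computed. As a minimal sketch, the following assumes that homology is measured as percent identity between two pre-aligned amino acid sequences of equal length, and reads the thresholds of 90, 95 and 99 recited above as percentages. The function names and the example peptides are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions carrying the same residue.

    Assumes the two sequences are already aligned and of equal length;
    gap handling is deliberately left out of this sketch.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def homology_tier(identity: float):
    """Return the highest recited threshold that the identity exceeds, or None."""
    for threshold in (99, 95, 90):
        if identity > threshold:
            return f"in excess of {threshold}"
    return None


# Hypothetical ten-residue peptides differing at one position.
identity = percent_identity("MSTNPKPQRK", "MSTLPKPQRK")
tier = homology_tier(identity)
```

A full-length comparison would normally be preceded by a sequence alignment step, which this sketch leaves out.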

Variations are contemplated in the cDNA sequences
shown in SEQ ID NO:1 through SEQ ID NO:51 and in SEQ ID
NO:103 through SEQ ID NO:154 which will result in a nucleic
acid sequence that is capable of directing production of
analogs of the corresponding protein shown in SEQ ID NO:52
through SEQ ID NO:102 and in SEQ ID NO:155 through SEQ ID
NO:206. It should be noted that the cDNA sequences set
forth above represent a preferred embodiment of the present
invention. Due to the degeneracy of the genetic code, it
is to be understood that numerous choices of nucleotides
may be made that will lead to a DNA sequence capable of
directing production of the instant protein or its analogs.
As such, DNA sequences which are functionally equivalent to
the sequence set forth above or which are functionally
equivalent to sequences that would direct production of
analogs of the E1 and core proteins produced pursuant to
the amino acid sequences set forth above, are intended to
be encompassed within the present invention.
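
The effect of the degeneracy of the genetic code can be made concrete by counting how many distinct DNA sequences encode a given peptide. The sketch below uses the standard genetic code; the function name coding_sequence_count and the example peptide are illustrative assumptions and do not appear in the sequence listing.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons for each residue ("*" marks a stop codon).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def coding_sequence_count(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    unknown = set(peptide) - set(CODONS_PER_RESIDUE)
    if unknown:
        raise ValueError(f"unknown residue(s): {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


# Illustrative ten-residue peptide in one-letter code.
choices = coding_sequence_count("MSTNPKPQRK")
```

Because the count is a product over residues, it grows rapidly with peptide length, which is why a single amino acid sequence corresponds to numerous functionally equivalent DNA sequences.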

The term "analog" as used throughout the
specification or claims to describe the E1 and core
proteins and peptides of the present invention, includes
any protein or peptide having an amino acid residue
sequence substantially identical to a sequence specifically
shown herein in which one or more residues have been
conservatively substituted with a biologically equivalent
residue. Examples of conservative substitutions include
the substitution of one non-polar (hydrophobic) residue such
as isoleucine, valine, leucine or methionine for another, the
substitution of one polar (hydrophilic) residue for another
such as between arginine and lysine, between glutamine and
asparagine, between glycine and serine, the substitution of
one basic residue such as lysine, arginine or histidine for
another, or the substitution of one acidic residue, such as
aspartic acid or glutamic acid for another.
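
The example substitution classes listed above can be encoded as residue groups and used to test whether one sequence differs from another only by conservative substitutions. This is a minimal sketch: the groups contain only the examples recited above, which the text presents as non-exhaustive, and the function names and test peptides are hypothetical. It does not assess biological equivalence, which the definition also requires.

```python
# Residue groups (one-letter code) taken from the examples recited above.
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # polar: arginine and lysine
    frozenset("QN"),    # polar: glutamine and asparagine
    frozenset("GS"),    # polar: glycine and serine
    frozenset("KRH"),   # basic: lysine, arginine, histidine
    frozenset("DE"),    # acidic: aspartic acid, glutamic acid
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share one of the groups above."""
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


def differs_only_conservatively(reference: str, candidate: str) -> bool:
    """True if candidate has at least one substitution and all are conservative."""
    if len(reference) != len(candidate):
        return False
    changes = [(a, b) for a, b in zip(reference, candidate) if a != b]
    return bool(changes) and all(is_conservative(a, b) for a, b in changes)
```

For example, under these groups a lysine to arginine change counts as conservative, whereas a lysine to aspartic acid change does not.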

The phrase "conservative substitution" also
includes the use of a chemically derivatized residue in
place of a non-derivatized residue provided that the
resulting protein or peptide is biologically equivalent to
the native E1 or core protein or peptide.

"Chemical derivative" refers to an E1 or core
protein or peptide having one or more residues chemically
derivatized by reaction of a functional side group.
Examples of such derivatized molecules include, but are not
limited to, those molecules in which free amino groups have
been derivatized to form amine hydrochlorides, p-toluene
sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl
groups, chloroacetyl groups or formyl groups. Free carboxyl
groups may be derivatized to form salts, methyl and ethyl
esters or other types of esters or hydrazides. Free
hydroxyl groups may be derivatized to form O-acyl or
O-alkyl derivatives. The imidazole nitrogen of histidine may
be derivatized to form N-im-benzylhistidine. Also included
as chemical derivatives are those proteins or peptides
which contain one or more naturally-occurring amino acid
derivatives of the twenty standard amino acids. For
example: 4-hydroxyproline may be substituted for proline;
5-hydroxylysine may be substituted for lysine;
3-methylhistidine may be substituted for histidine;
homoserine may be substituted for serine; and ornithine may
be substituted for lysine. The E1 and core proteins and
peptides of the present invention also include any protein
or peptide having one or more additions and/or deletions of
residues relative to the sequence of a peptide whose
sequence is shown herein, so long as the peptide is
biologically equivalent to the native E1 or core protein or
peptide.

The present invention also includes a recombinant
DNA method for the manufacture of HCV E1 and core proteins.
In this method, natural or synthetic nucleic acid sequences
may be used to direct the production of E1 and core
proteins.

In one embodiment of the invention, the method
comprises:

(a) preparation of a nucleic acid sequence
capable of directing a host organism to produce HCV E1 or
core protein;

(b) cloning the nucleic acid sequence into a
vector capable of being transferred into and replicated in
a host organism, such vector containing operational
elements for the nucleic acid sequence;

(c) transferring the vector containing the
nucleic acid and operational elements into a host organism
capable of expressing the protein;

(d) culturing the host organism under conditions
appropriate for amplification of the vector and expression
of the protein; and

(e) harvesting the protein.

In another embodiment of the invention, the
method for the recombinant DNA synthesis of an HCV E1
protein encoded by any one of the nucleic acid sequences
shown in SEQ ID NOs:1-51 comprises:

(a) culturing a transformed or transfected host
organism containing a nucleic acid sequence capable of
directing the host organism to produce a protein, under
conditions such that the protein is produced, said protein
exhibiting substantial homology to a native E1 protein
isolated from HCV having the amino acid sequence according
to any one of the amino acid sequences shown in SEQ ID
NOs:52-102 or combinations thereof.

In one embodiment, the RNA sequence of an HCV
isolate is isolated and converted to cDNA as follows.
Viral RNA is extracted from a biological sample collected
from human subjects infected with hepatitis C and the viral
RNA is then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
HCV strain H-77 (Ogata et al. (1991)). Preferred primer
sequences are shown as SEQ ID NOs:207-212 in the sequence
listing. Once amplified, the PCR fragments are isolated by
gel electrophoresis and sequenced.

In an alternative embodiment, the above method
may be utilized for the recombinant DNA synthesis of an HCV
core protein encoded by any one of the nucleic acid
sequences shown in SEQ ID NOs:103-154, where the protein
produced by this method exhibits substantial homology to a
native core protein isolated from HCV having the amino acid
sequence according to any one of the amino acid sequences
shown in SEQ ID NOs:155-206 or combinations thereof.

The vectors contemplated for use in the present
invention include any vectors into which a nucleic acid
sequence as described above can be inserted, along with any
preferred or required operational elements, and which
vector can then be subsequently transferred into a host
organism and replicated in such organisms. Preferred
vectors are those whose restriction sites have been well
documented and which contain the operational elements
preferred or required for transcription of the nucleic acid
sequence.

The "operational elements" as discussed herein
include at least one promoter, at least one operator, at
least one leader sequence, at least one terminator codon,
and any other DNA sequences necessary or preferred for
appropriate transcription and subsequent translation of the
vector nucleic acid. In particular, it is contemplated
that such vectors will contain at least one origin of
replication recognized by the host organism along with at
least one selectable marker and at least one promoter
sequence capable of initiating transcription of the nucleic
acid sequence.

In construction of the recombinant expression
vectors of the present invention, it should additionally be
noted that multiple copies of the nucleic acid sequence of
interest (either E1 or core) and its attendant operational
elements may be inserted into each vector. In such an
embodiment, the host organism would produce greater amounts
per vector of the desired E1 or core protein. The number
of multiple copies of the nucleic acid sequence which may
be inserted into the vector is limited only by the ability
of the resultant vector, due to its size, to be transferred
into and replicated and transcribed in an appropriate host
microorganism.

Of course, those skilled in the art would readily
understand that copies of both core and E1 nucleic acid
sequences may be inserted into a single vector such that a
host organism transformed or transfected with said vector
would produce both the desired E1 and core proteins. For
example, a polycistronic vector in which multiple different
E1 and/or core proteins may be expressed from a single
vector is created by placing expression of each protein
under control of an internal ribosomal entry site
(IRES) (Molla, A. et al. Nature 356:255-257 (1992); Jang,
S.K. et al. J. of Virol. 63:1651-1660 (1989)).

In another embodiment, restriction digest
fragments containing a coding sequence for E1 or core
proteins can be inserted into a suitable expression vector
that functions in prokaryotic or eukaryotic cells. By
suitable is meant that the vector is capable of carrying
and expressing a complete nucleic acid sequence coding for
an E1 or core protein. Preferred expression vectors are
those that function in a eukaryotic cell. Examples of such
vectors include but are not limited to vaccinia virus
vectors, adenovirus or herpes viruses. A preferred vector
is the baculovirus transfer vector, pBlueBac.

In yet another embodiment, the selected 
recombinant expression vector may then be transfected into 
a suitable eukaryotic cell system for purposes of
expressing the recombinant protein. Such eukaryotic cell
systems include but are not limited to cell lines such as 
HeLa, MRC-5 or CV-1. A preferred eukaryotic cell system is 
SF9 insect cells. 

The expressed recombinant protein may be detected 
by methods known in the art including, but not limited to, 
Coomassie blue staining and Western blotting.

The present invention also relates to
substantially purified and isolated recombinant E1 and core
proteins. In one embodiment, the recombinant protein
expressed by the SF9 cells can be obtained as a crude
lysate or it can be purified by standard protein
purification procedures known in the art which may include
differential precipitation, molecular sieve chromatography,
ion-exchange chromatography, isoelectric focusing, gel
electrophoresis and affinity and immunoaffinity
chromatography. The recombinant protein may be purified by
passage through a column containing a resin which has bound
thereto antibodies specific for the open reading frame
(ORF) protein.

The present invention further relates to the use
of recombinant E1 and core proteins as diagnostic agents
and vaccines. In one embodiment, the expressed recombinant
proteins of this invention can be used in immunoassays for
diagnosing or prognosing hepatitis C in a mammal. For the
purposes of the present invention, "mammal" as used
throughout the specification and claims, includes, but is
not limited to humans, chimpanzees, other primates and the
like. In a preferred embodiment, the immunoassay is useful
in diagnosing hepatitis C infection in humans.

Immunoassays of the present invention may be
those commonly used by those skilled in the art including,
but not limited to, radioimmunoassay, Western blot assay,
immunofluorescent assay, enzyme immunoassay,
chemiluminescent assay, immunohistochemical assay,
immunoprecipitation and the like. Standard techniques
known in the art for ELISA are described in Methods in
Immunodiagnosis, 2nd Edition, Rose and Bigazzi, eds., John
Wiley and Sons, 1980 and Campbell et al., Methods of
Immunology, W.A. Benjamin, Inc., 1964, both of which are
incorporated herein by reference. Such assays may be a
direct, indirect, competitive, or noncompetitive
immunoassay as described in the art (Oellerich, M. 1984.
J. Clin. Chem. Clin. BioChem 22:895-904). Biological samples
appropriate for such detection assays include, but are not
limited to serum, liver, saliva, lymphocytes or other
mononuclear cells.

In a preferred embodiment, test serum is reacted
with a solid phase reagent having surface-bound recombinant
HCV E1 and/or core protein(s) as antigen(s). The solid
surface reagent can be prepared by known techniques for
attaching protein to solid support material. These
attachment methods include non-specific adsorption of the
protein to the support or covalent attachment of the
protein to a reactive group on the support. After reaction
of the antigen with anti-HCV antibody, unbound serum
components are removed by washing and the antigen-antibody
complex is reacted with a secondary antibody such as
labelled anti-human antibody. The label may be an enzyme
which is detected by incubating the solid support in the
presence of a suitable fluorimetric or colorimetric
reagent. Other detectable labels may also be used, such as
radiolabels or colloidal gold, and the like.

The HCV E1 and/or core proteins and analogs
thereof may be prepared in the form of a kit, alone, or in
combinations with other reagents such as secondary
antibodies, for use in immunoassays.

In yet another embodiment the recombinant E1 and
core proteins or analogs thereof can be used as a vaccine
to protect mammals against challenge with hepatitis C. The
vaccine, which acts as an immunogen, may be a cell, cell
lysate from cells transfected with a recombinant expression
vector or a culture supernatant containing the expressed
protein. Alternatively, the immunogen is a partially or
substantially purified recombinant protein. In yet another
embodiment, the immunogen may be a fusion protein
comprising core protein and a second, non-core protein
joined together such that the core portion of the fusion
protein will aggregate and "trap" the second protein on the
surface of the particle produced by aggregation of the core
protein ("Molecular Biology of the Hepatitis B Virus",
McLachlan, A. (1991) CRC Press, Boca Raton, Fla.).
Alternatively, the core protein could be mixed with the
second protein in vitro to produce particles in which all
or part of the second protein was exposed on the surface of
the particle. Such particles would then serve as a carrier
in a multi-valent vaccine preparation. Second proteins or
parts thereof which could be mixed with or fused to the
core protein include, but are not limited to, HCV E1 and
hepatitis B surface antigen.

While it is possible for the immunogen to be
administered in a pure or substantially pure form, it is
preferable to present it as a pharmaceutical composition,
formulation or preparation.

The formulations of the present invention, both
for veterinary and for human use, comprise an immunogen as
described above, together with one or more pharmaceutically
acceptable carriers and optionally other therapeutic
ingredients. The carrier(s) must be "acceptable" in the
sense of being compatible with the other ingredients of the
formulation and not deleterious to the recipient thereof.
The formulations may conveniently be presented in unit
dosage form and may be prepared by any method well-known in
the pharmaceutical art.

All methods include the step of bringing into
association the active ingredient with the carrier which
constitutes one or more accessory ingredients. In general,
the formulations are prepared by uniformly and intimately
bringing into association the active ingredient with liquid
carriers or finely divided solid carriers or both, and
then, if necessary, shaping the product into the desired
formulation.

Formulations suitable for intravenous,
intramuscular, subcutaneous, or intraperitoneal
administration conveniently comprise sterile aqueous
solutions of the active ingredient with solutions which are
preferably isotonic with the blood of the recipient. Such
formulations may be conveniently prepared by dissolving the
solid active ingredient in water containing physiologically
compatible substances such as sodium chloride (e.g.
0.1-2.0 M), glycine, and the like, and having a buffered pH
compatible with physiological conditions to produce an
aqueous solution, and rendering said solution sterile.
These may be presented in unit or multi-dose containers, for
example, sealed ampules or vials.

The formulations of the present invention may
incorporate a stabilizer. Illustrative stabilizers are
preferably incorporated in an amount of 0.10-10,000 parts
by weight per part by weight of immunogens. If two or more
stabilizers are to be used, their total amount is
preferably within the range specified above. These
stabilizers are used in aqueous solutions at the
appropriate concentration and pH. The specific osmotic
pressure of such aqueous solutions is generally in the
range of 0.1-3.0 osmoles, preferably in the range of
0.8-1.2. The pH of the aqueous solution is adjusted to be
within the range of 5.0-9.0, preferably within the range of
6-8. In formulating the immunogen of the present
invention, an anti-adsorption agent may be used.

Additional pharmaceutical methods may be employed
to control the duration of action. Controlled release
preparations may be achieved through the use of polymers to
complex or adsorb the proteins or their derivatives. The
controlled delivery may be exercised by selecting
appropriate macromolecules (for example polyester,
polyamino acids, polyvinyl pyrrolidone,
ethylenevinylacetate, methyl cellulose,
carboxymethylcellulose, or protamine sulfate) and the
concentration of macromolecules as well as the methods of
incorporation in order to control release. Another
possible method to control the duration of action by
controlled-release preparations is to incorporate the
proteins, protein analogs or their functional derivatives,
into particles of a polymeric material such as polyesters,
polyamino acids, hydrogels, poly(lactic acid) or ethylene
vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is
possible to entrap these materials in microcapsules
prepared, for example, by coacervation techniques or by
interfacial polymerization, for example,
hydroxymethylcellulose or gelatin microcapsules and
poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes,
albumin microspheres, microemulsions, nanoparticles, and
nanocapsules or in macroemulsions.

When oral preparations are desired, the
compositions may be combined with typical carriers, such as
lactose, sucrose, starch, talc, magnesium stearate,
crystalline cellulose, methyl cellulose, carboxymethyl
cellulose, glycerin, sodium alginate or gum arabic among
others.

The E1 and core proteins of the present invention
may also be used as a delivery system for anti-virals to
prevent or attenuate HCV infection in a mammal by utilizing
the property of both proteins to self-aggregate in vitro to
"trap" the antiviral within the particles produced via
aggregation of the core and E1 proteins. Examples of
anti-virals which could be delivered by such a system include,
but are not limited to antisense DNA or RNAs.

Vaccination can be conducted by conventional
methods. For example, the immunogen or immunogens (e.g.
the E1 protein may be administered alone or in combination
with the E1 proteins derived from other isolates of HCV)
can be used in a suitable diluent such as saline or water,
or complete or incomplete adjuvants. Further, the
immunogen(s) may or may not be bound to a carrier to make
the protein(s) immunogenic. Examples of such carrier
molecules include but are not limited to bovine serum
albumin (BSA), keyhole limpet hemocyanin (KLH), tetanus
toxoid, and the like. The immunogen(s) can be administered
by any route appropriate for antibody production such as
intravenous, intraperitoneal, intramuscular, subcutaneous,
and the like. The immunogen(s) may be administered once or
at periodic intervals until a significant titer of anti-HCV
antibody is produced. The antibody may be detected in the
serum using an immunoassay.




In yet another embodiment, the immunogen may be
a nucleic acid sequence capable of directing host organism
synthesis of E1 and/or core protein(s). Such nucleic acid
sequence may be inserted into a suitable expression vector
by methods known to those skilled in the art. Expression
vectors suitable for producing high efficiency gene
transfer in vivo include retroviral, adenoviral and
vaccinia viral vectors. Operational elements of such
expression vectors are disclosed previously in the present
specification and are known to one skilled in the art.
Such expression vectors can be administered intravenously,
intramuscularly, subcutaneously, intraperitoneally or
orally.

In an alternative embodiment, direct gene
transfer may be accomplished via intramuscular injection
of, for example, plasmid-based eukaryotic expression
vectors containing a nucleic acid sequence capable of
directing host organism synthesis of E1 and/or core
protein(s). Such an approach has previously been utilized
to produce the hepatitis B surface antigen in vivo and
resulted in an antibody response to the surface antigen
(Davis, H.L. et al. (1993) Human Molecular Genetics
2:1847-1851; see also Davis et al. (1993) Human Gene
Therapy 4:151-159 and 733-740).

Doses of E1 and/or core protein(s)-encoding
nucleic acid sequence effective to elicit a protective
antibody response against HCV infection range from about 1
to about 500 µg. A more preferred range is about 1 to
about 500 µg.

The E1 and/or core proteins and expression
vectors containing a nucleic acid sequence capable of
directing host organism synthesis of E1 and/or core
protein(s) may be supplied in the form of a kit, alone, or
in the form of a pharmaceutical composition as described
above.

The administration of the immunogen(s) of the
present invention may be for either a prophylactic or
therapeutic purpose. When provided prophylactically, the
immunogen(s) is provided in advance of any exposure to HCV
or in advance of any symptoms due to HCV
infection. The prophylactic administration of the
immunogen serves to prevent or attenuate any subsequent
infection of HCV in a mammal. When provided
therapeutically, the immunogen(s) is provided at (or
shortly after) the onset of the infection or at the onset
of any symptom of infection or disease caused by HCV. The
therapeutic administration of the immunogen(s) serves to
attenuate the infection or disease.

In addition to use as a vaccine, the compositions
can be used to prepare antibodies to HCV E1 and core
proteins. The antibodies can be used directly as antiviral
agents or they may be used in immunoassays disclosed herein
to detect HCV E1 and core proteins present in patient
sera. To prepare antibodies, a host animal is immunized
using the E1 and/or core proteins native to the virus
particle bound to a carrier as described above for
vaccines. The host serum or plasma is collected following
an appropriate time interval to provide a composition
comprising antibodies reactive with the E1 or core protein
of the virus particle. The gamma globulin fraction or the
IgG antibodies can be obtained, for example, by use of
saturated ammonium sulfate or DEAE Sephadex, or other
techniques known to those skilled in the art. The
antibodies are substantially free of many of the adverse
side effects which may be associated with other anti-viral
agents such as drugs.

The antibody compositions can be made even more
compatible with the host system by minimizing potential
adverse immune system responses. This is accomplished by
removing all or a portion of the Fc portion of a foreign
species antibody or using an antibody of the same species
as the host animal, for example, the use of antibodies from
human/human hybridomas. Humanized antibodies (i.e.,
nonimmunogenic in a human) may be produced, for example, by
replacing an immunogenic portion of an antibody with a
corresponding, but nonimmunogenic portion (i.e., chimeric
antibodies). Such chimeric antibodies may contain the
reactive or antigen-binding portion of an antibody from one
species and the Fc portion of an antibody (nonimmunogenic)
from a different species. Examples of chimeric antibodies
include, but are not limited to, non-human mammal-human
chimeras, rodent-human chimeras, murine-human and rat-human
chimeras (Robinson et al., International Patent Application
184,187; Taniguchi M., European Patent Application 171,496;
Morrison et al., European Patent Application 173,494;
Neuberger et al., PCT Application WO 86/01533; Cabilly et
al., 1987 Proc. Natl. Acad. Sci. USA 84:3439; Nishimura et
al., 1987 Canc. Res. 47:999; Wood et al., 1985 Nature
314:446; Shaw et al., 1988 J. Natl. Cancer Inst. 80:15553,
all incorporated herein by reference).

General reviews of "humanized" chimeric 
antibodies are provided by Morrison S., 1985 Science 
229:1202 and by Oi et al., 1986 BioTechniques 4:214.

Suitable "humanized" antibodies can be
alternatively produced by CDR or CEA substitution (Jones et
al., 1986 Nature 321:552; Verhoeyan et al., 1988 Science
239:1534; Biedler et al., 1988 J. Immunol. 141:4053, all
incorporated herein by reference).

The antibodies or antigen-binding fragments may
also be produced by genetic engineering. The technology
for expression of both heavy and light chain genes in E.
coli is the subject of PCT patent applications,
publication numbers WO 901443, WO 901443, and WO 9014424, and
of Huse et al., 1989 Science 246:1275-1281.

The antibodies can also be used as a means of
enhancing the immune response. The antibodies can be
administered in amounts similar to those used for other
therapeutic administrations of antibody. For example,
normal immune globulin is administered at 0.02-0.1 ml/lb
body weight during the early incubation period of other
viral diseases such as rabies, measles, and hepatitis B to
interfere with viral entry into cells. Thus, antibodies
reactive with the HCV E1 and/or core proteins can be
passively administered alone or in conjunction with another
anti-viral agent to a host infected with an HCV to enhance
the immune response and/or the effectiveness of an
antiviral drug.

Alternatively, anti-HCV E1 antibodies and anti-HCV
core antibodies can be induced by administering
anti-idiotype antibodies as immunogens. Conveniently, a
purified anti-HCV E1 or anti-HCV core antibody preparation
prepared as described above is used to induce anti-idiotype
antibody in a host animal; the composition is administered
to the host animal in a suitable diluent. Following
administration, usually repeated administration, the host
produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies produced
by the same species as the host animal can be used or the
Fc region of the administered antibodies can be removed.
Following induction of anti-idiotype antibody in the host
animal, serum or plasma is removed to provide an antibody
composition. The composition can be purified as described
above for anti-HCV E1 and anti-HCV core antibodies, or by
affinity chromatography using anti-HCV E1 or anti-HCV core
antibodies bound to the affinity matrix. The anti-idiotype
antibodies produced are similar in conformation to the
authentic HCV E1 or core protein and may be used to prepare
an HCV vaccine rather than using an HCV E1 or core protein.

When used as a means of inducing anti-HCV virus 
antibodies in an animal, the manner of injecting the 
antibody is the same as for vaccination purposes, namely 
intramuscularly, intraperitoneally, subcutaneously or the 
like in an effective concentration in a physiologically 
suitable diluent with or without adjuvant. One or more 
booster injections may be desirable.

The HCV E1 and core proteins of the invention are 
also intended for use in producing antiserum designed for 
pre- or post-exposure prophylaxis. Here an E1 or core 
protein, or mixture of E1 and/or core proteins, is 
formulated with a suitable adjuvant and administered by 
injection to human volunteers, according to known methods 
for producing human antisera. Antibody response to the 
injected proteins is monitored, during a several-week 
period following immunization, by periodic serum sampling 
to detect the presence of anti-HCV E1 and/or anti-HCV core 
serum antibodies, using an immunoassay as described herein.

The antiserum from immunized individuals may be 
administered as a pre-exposure prophylactic measure for 
individuals who are at risk of contracting infection. The 
antiserum is also useful in treating an individual post- 
exposure, analogous to the use of high titer antiserum 
against hepatitis B virus for post-exposure prophylaxis. 

For both in vivo use of antibodies to HCV virus- 
like particles and proteins and anti-idiotype antibodies 
and diagnostic use, it may be preferable to use monoclonal 
antibodies. Monoclonal anti-HCV E1 and anti-HCV core 
protein antibodies or anti-idiotype antibodies can be 
produced as follows. The spleen or lymphocytes from an 
immunized animal are removed and immortalized or used to 
prepare hybridomas by methods known to those skilled in the 
art (Goding, J.W. 1983. Monoclonal Antibodies: 
Principles and Practice, Academic Press, Inc., NY, NY, pp. 
56-97). To produce a human-human hybridoma, a human 
lymphocyte donor is selected. A donor known to be infected 
with HCV (where infection has been shown, for example, by the 
presence of anti-virus antibodies in the blood or by virus 
culture) may serve as a suitable lymphocyte donor. 
Lymphocytes can be isolated from a peripheral blood sample, 
or spleen cells may be used if the donor is subject to 
splenectomy. Epstein-Barr virus (EBV) can be used to 
immortalize human lymphocytes or a human fusion partner can 
be used to produce human-human hybridomas. Primary in 
vitro immunization with peptides can also be used in the 
generation of human monoclonal antibodies.

Antibodies secreted by the immortalized cells are 
screened to determine the clones that secrete antibodies of 
the desired specificity. For monoclonal anti-E1 and anti- 
core antibodies, the antibodies must bind to HCV E1 and 
core proteins respectively. For monoclonal anti-idiotype 
antibodies, the antibodies must bind to anti-E1 and anti- 
core protein antibodies respectively. Cells producing 
antibodies of the desired specificity are selected.

The present invention also relates to the use of 
single-stranded antisense poly- or oligonucleotides derived 
from nucleotide sequences substantially homologous to those 
shown in SEQ ID NOs:1-51 to inhibit the expression of 
hepatitis C E1 genes. The present invention further 
relates to the use of single-stranded antisense poly- or 
oligonucleotides derived from nucleotide sequences 
substantially homologous to those shown in SEQ ID NOs:103- 
154 to inhibit the expression of hepatitis C core genes. 
Alternatively, the antisense poly- or oligonucleotides 
may be complementary to both the E1 and core genes and 
hence, inhibit the expression of both hepatitis C E1 and 
core genes. By substantially homologous, as used throughout 
the specification and claims to describe the nucleic acid 
sequences of the present invention, is meant a level of 
homology between the nucleic acid sequence and the SEQ ID 
NOs. referred to in the above sentence. Preferably, the 
level of homology is in excess of 80%, more preferably in 
excess of 90%, with a preferred nucleic acid sequence being 
in excess of 95% homologous with the DNA sequence shown in 
the indicated SEQ ID NO. These antisense poly- or 
oligonucleotides can be either DNA or RNA. The targeted 
sequence is typically messenger RNA and more preferably, a 
signal sequence required for processing or translation of 
the RNA. The antisense poly- or oligonucleotides can be 
conjugated to a polycation such as polylysine as disclosed 
in Lemaitre, M. et al. ((1989) Proc. Natl. Acad. Sci. USA 
64:648-652) and this conjugate can be administered to a 
mammal in an amount sufficient to hybridize to and inhibit 
the function of the messenger RNA.

The present invention further relates to multiple 
computer-generated alignments of the nucleotide and deduced 
amino acid sequences shown in SEQ ID NOs:1-206. Computer 
analysis of the nucleotide sequences shown in SEQ ID NOs:1- 
51 and 103-154 and of the deduced amino acid sequences 
shown in SEQ ID NOs:52-102 and 155-206 can be carried out 
using commercially available computer programs known to one 
skilled in the art. 

In one embodiment, computer analysis of SEQ ID 
NOs:1-51 by the program GENALIGN (IntelliGenetics, Inc., 
Mountain View, CA) results in distribution of the 51 HCV E1 
sequences into twelve genotypes based upon the degree of 
variation of the sequences. For the purposes of the 
present invention, the nucleotide sequence identity of E1 
cDNAs of HCV isolates of the same genotype is in the range 
of about 85% to about 100% whereas the identity of E1 cDNA 
sequences of different genotypes is in the range of about 
50% to about 80%.
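
The identity ranges stated above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (for example by a program such as GENALIGN), that gaps are written as "-", and that the "about 85%" and "about 50% to about 80%" figures are treated as hard cutoffs. The function names and the "indeterminate" label are hypothetical and are not part of the specification.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def classify_pair(identity: float) -> str:
    """Apply the stated ranges as hard cutoffs (an assumption of this sketch)."""
    if identity >= 85.0:
        return "same genotype"
    if 50.0 <= identity <= 80.0:
        return "different genotypes"
    return "indeterminate"
```

A pair that falls between the two stated ranges is reported as indeterminate here rather than forced into either class.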

The grouping of SEQ ID NOs:1-51 into twelve HCV 
genotypes is shown below.

SEQ ID NOs:     Genotype
1-8             I/1a
9-25            II/1b
26-29           III/2a
30-33           IV/2b
34              2c
35-39           V/3a
40              4a
41              4b
42-43           4c
44              4d
45-50           5a
51              6a

For those genotypes containing more than one E1 
nucleotide sequence, computer alignment of the constituent 
nucleotide sequences of the genotype was conducted using 
GENALIGN in order to produce a consensus sequence for each 
genotype. These alignments and their resultant consensus 
sequences are shown in Figures 1A-G for the seven genotypes 
(I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a) which 
comprise more than one nucleotide sequence. Further 
alignment of the consensus sequences of Figures 1A-G with 
SEQ ID NO:34 (genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ 
ID NO:41 (genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ 
ID NO:51 (genotype 6a) produces a consensus sequence for 
all twelve genotypes as shown in Figure 1H. The multiple 
alignments of nucleotide sequences shown in Figures 1A-H 
produce consensus sequences which serve to highlight 
regions of homology and non-homology between sequences 
found within the same genotype or in different genotypes 
and hence, these alignments can be used by one skilled in 
the art to design oligonucleotides useful as reagents in 
diagnostic assays for HCV.
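
How a consensus sequence can be derived from aligned sequences may be sketched as follows. The rule GENALIGN actually applies is not described in this specification, so the sketch assumes a simple per-column majority vote in which ties are reported as "N"; the function name and the tie symbol are hypothetical.

```python
from collections import Counter


def consensus(aligned: list[str], ambiguous: str = "N") -> str:
    """Majority-rule consensus of equal-length aligned sequences.

    A column with a tie for the most frequent base is reported as `ambiguous`.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        ranked = Counter(base.upper() for base in column).most_common()
        top_base, top_count = ranked[0]
        if len(ranked) > 1 and ranked[1][1] == top_count:
            result.append(ambiguous)  # no single majority base in this column
        else:
            result.append(top_base)
    return "".join(result)
```

Columns in which all sequences agree mark candidate regions of homology, while columns that differ between genotype consensus sequences mark candidate genotype-specific positions.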

Examples of purified and isolated oligonucleotide 
sequences derived from the consensus sequences shown in 
Figures 1A-H include, but are not limited to, SEQ ID 
NOs:213-239 where these oligonucleotides are useful as 
"genotype-specific" primers and probes since these 
oligonucleotides can hybridize specifically to the 
nucleotide sequence of the E1 gene of HCV isolates 
belonging to a single genotype. The genotype-specificity 
of the oligonucleotides shown in SEQ ID NOs:213-239 is as 
follows: SEQ ID NOs:213-214 are specific for genotype 
I/1a; SEQ ID NOs:215-216 are specific for genotype II/1b; 
SEQ ID NOs:217-218 are specific for genotype III/2a; SEQ ID 
NOs:219-220 are specific for genotype IV/2b; SEQ ID 
NOs:221-223 are specific for genotype 2c; SEQ ID NOs:224- 
226 are specific for genotype V/3a; SEQ ID NOs:227-228 are 
specific for genotype 4a; SEQ ID NOs:229-230 are specific 
for genotype 4b; SEQ ID NOs:231-232 are specific for 
genotype 4c; SEQ ID NOs:233-234 are specific for genotype 
4d; SEQ ID NOs:235-236 are specific for genotype 5a and SEQ 
ID NOs:237-239 are specific for genotype 6a.
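
The genotype specificity listed above can be held in a small lookup table, sketched here for convenience. The SEQ ID ranges are those given in the preceding paragraph; the dictionary and function names are hypothetical.

```python
# Genotype -> inclusive range of SEQ ID NOs of the E1 genotype-specific oligonucleotides.
GENOTYPE_SPECIFIC_OLIGOS = {
    "I/1a": (213, 214),
    "II/1b": (215, 216),
    "III/2a": (217, 218),
    "IV/2b": (219, 220),
    "2c": (221, 223),
    "V/3a": (224, 226),
    "4a": (227, 228),
    "4b": (229, 230),
    "4c": (231, 232),
    "4d": (233, 234),
    "5a": (235, 236),
    "6a": (237, 239),
}


def genotype_for_seq_id(seq_id: int) -> str:
    """Return the genotype for which the oligonucleotide of this SEQ ID NO is specific."""
    for genotype, (first, last) in GENOTYPE_SPECIFIC_OLIGOS.items():
        if first <= seq_id <= last:
            return genotype
    raise KeyError(f"SEQ ID NO:{seq_id} is not a listed genotype-specific oligonucleotide")
```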

In another embodiment, the computer analysis of 
SEQ ID NOs: 103-154 by the program GENALIGN results in 
distribution of the 52 HCV core sequences into 14 genotypes 
based upon the degree of variation of the sequences. 

The grouping of SEQ ID NOs:103-154 into 14 HCV 
genotypes is shown below.

SEQ ID NOs:     Genotypes
103-108         I/1a
109-124         II/1b
125-128         III/2a
129-133         IV/2b
134             2c
135-138         V/3a
139             4a
141             4b
143             4c
144             4c
145             4d
142             4e
140             4f
146-153         5a
154             6a


These 14 genotypes can be further grouped into 6 
major genotypes designated genotypes 1-6, where genotype 1 
comprises the sequences contained in minor genotypes I/1a 
and II/1b; genotype 2 comprises the sequences contained in 
minor genotypes III/2a, IV/2b and 2c; genotype 3 comprises 
sequences contained in genotype V/3a; genotype 4 comprises 
sequences contained in minor genotypes 4a-4f; genotype 5 
comprises the sequences contained in genotype 5a and 
genotype 6 comprises the sequence contained in genotype 6a. 
Computer alignment of the constituent nucleotide sequences 
of the core cDNAs falling within genotypes I/1a, II/1b, 
III/2a, IV/2b, V/3a and 5a, to produce a consensus sequence 
for each of these genotypes, is shown in Figures 6A (I/1a), 
6B (II/1b), 6D (III/2a), 6E (IV/2b), 6G (V/3a) and 6I (5a). 
The alignment of the sequences found in minor genotypes 
I/1a and II/1b to produce a consensus sequence for major 
genotype 1 is shown in Figure 6C. The alignment of the 
sequences contained in minor genotypes III/2a, IV/2b and 2c 
to produce a consensus sequence for major genotype 2 is 
shown in Figure 6F. The alignment of the nucleotide 
sequences contained in minor genotypes 4a-4f to produce a 
consensus sequence for major genotype 4 is shown in Figure 
6H. Further alignment of the consensus sequences shown in 
Figures 6C, 6F, 6G, 6H and 6I with SEQ ID NO:154 (genotype 
6a/major genotype 6) to produce a consensus sequence for 
all genotypes is shown in Figure 6J, and alignment of the 
consensus sequences shown in Figures 6A, 6B, 6D, 6E, 6G and 
6I with SEQ ID NO:134 (genotype 2c), SEQ ID NO:139 (genotype 
4a), SEQ ID NO:141 (genotype 4b), SEQ ID NO:143 
(genotype 4c), SEQ ID NO:145 (genotype 4d), SEQ ID NO:142 
(genotype 4e), SEQ ID NO:140 (genotype 4f) and SEQ ID 
NO:154 (genotype 6a) to produce a consensus sequence for 
all fourteen genotypes is shown in Figure 6K. As with the 
alignments of the envelope (E1) nucleotide sequences, the 
consensus sequences shown in Figures 6A-6K serve to 
highlight regions of homology and non-homology between 
sequences found within the same genotype or in different 
genotypes and hence, can be used by one skilled in the art 
to design oligonucleotides useful as reagents in diagnostic 
assays for HCV.
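
The grouping of the fourteen minor genotypes into six major genotypes follows directly from the genotype names, as the short sketch below illustrates. It assumes each minor genotype name ends in an Arabic digit followed by a lower-case letter (for example "IV/2b" or "4c"); the function names are hypothetical.

```python
import re

MINOR_GENOTYPES = [
    "I/1a", "II/1b", "III/2a", "IV/2b", "2c", "V/3a",
    "4a", "4b", "4c", "4d", "4e", "4f", "5a", "6a",
]


def major_genotype(minor: str) -> int:
    """Return the major genotype number encoded at the end of a minor genotype name."""
    match = re.search(r"(\d)[a-z]$", minor)
    if match is None:
        raise ValueError(f"unrecognized genotype name: {minor!r}")
    return int(match.group(1))


def group_by_major(minors: list[str]) -> dict[int, list[str]]:
    """Group minor genotype names under their major genotype number."""
    groups: dict[int, list[str]] = {}
    for minor in minors:
        groups.setdefault(major_genotype(minor), []).append(minor)
    return groups
```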

For example, purified and isolated 
oligonucleotide sequences derived from the consensus 
sequences shown in Figures 6A-6K may be useful as genotype- 
specific primers and probes since these oligonucleotides 
can hybridize specifically to the nucleotide sequence of 
the core gene of HCV isolates belonging to a given 
genotype. Examples of regions of the consensus sequence of 
the core gene of a given genotype from which primers 
specific for that genotype may be deduced include, but are 
not limited to, the nucleotide domains shown below for each 
genotype. The sequence in which the indicated nucleotide 
domains are found is indicated in parentheses to the right 
of each genotype.

Genotype 1 (Consensus Sequence of Figure 6C) 
427-466, 444-483, 447-486 (5'-3', sense) 
505-466, 522-483, 525-486 (5'-3', antisense) 

Genotype 1a (Consensus Sequence of Figure 6A) 
141-180, 279-318 (5'-3', sense) 
219-180, 246-207 (5'-3', antisense) 

Genotype 1b (Consensus Sequence of Figure 6B) 
67-106, 127-186, 234-273 (5'-3', sense) 
144-106, 225-186, 311-272, 312-273 (5'-3', antisense) 

Genotype 2 (Consensus Sequence of Figure 6F) 
153-192, 162-201, 164-203, 168-207, 171-210, 182-221, 
192-231, 193-232, 302-341 (5'-3', sense) 
231-192, 240-201, 242-203, 246-207, 249-210, 260-221, 
270-231, 271-232, 380-341 (5'-3', antisense) 

Genotype III/2a (Consensus Sequence of Figure 6D) 
276-315, 306-355 (5'-3', sense) 
309-270, 354-315, 394-355, 571-532 (5'-3', antisense) 

Genotype IV/2b (Consensus Sequence of Figure 6E) 
6-45, 135-174, 177-216, 309-348, 337-376, 375-414, 501-540 
(5'-3', sense) 
84-45, 213-174, 255-216, 387-348, 415-376, 453-414, 
571-532, 573-540 (5'-3', antisense) 

Genotype 2c (SEQ ID NO:134) 
194-233, 273-312, 279-318, 417-456, 423-462, 504-543, 
505-544, 517-556 (5'-3', sense) 
272-233, 351-312, 354-315, 357-318, 450-411, 495-456, 
501-462, 573-543, 556-573 (5'-3', antisense) 

Genotype 3 or Genotype V/3a (Consensus Sequence of Figure 6G) 

8-47, 45-84, 68-107, 87-126, 88-127, 90-129, 111-150, 
142-181, 173-212, 177-216, 261-300, 276-315, 452-491, 
520-559, 521-560, 529-568, 532-571, 533-572 (5'-3', sense) 
86-47, 123-84, 146-107, 165-126, 186-147, 189-150, 219-180, 
250-211, 251-212, 255-216, 339-300, 530-491, 573-543, 
573-557, 573-559, 573-560 (5'-3', antisense) 

Genotype 4 (Consensus Sequence of Figure 6H) 
20-59 (5'-3', sense) 
97-58, 98-59 (5'-3', antisense) 

Genotype 4a (SEQ ID NO:139) 
111-150, 150-189, 174-213, 183-222, 192-231, 261-300, 
376-415, 396-435, 531-570 (5'-3', sense) 
186-147, 252-213, 270-231, 339-300, 454-415 (5'-3', 
antisense) 

Genotype 4b (SEQ ID NO:141) 
27-66, 30-69, 106-145, 271-310, 433-472, 447-486, 453-492 
(5'-3', sense) 
105-66, 183-144, 184-145, 345-306, 348-309, 349-310, 
468-429, 510-471, 522-483, 570-531 (5'-3', antisense) 

Genotype 4c (SEQ ID NO:143) 
174-213, 180-219, 207-246, 231-270 (5'-3', sense) 
249-210, 252-213, 258-219, 309-270, 504-465 (5'-3', 
antisense) 

Genotype 4d (SEQ ID NO:145) 
173-212, 188-327, 430-469 (5'-3', sense) 
248-209, 249-210, 250-211, 251-212, 366-327, 508-469 
(5'-3', antisense) 

Genotype 4e (SEQ ID NO:142) 
160-199, 267-306, 287-326, 288-327, 524-564 (5'-3', sense) 
238-199, 345-306, 365-326, 216-177, 522-483 (5'-3', 
antisense) 

Genotype 4f (SEQ ID NO:140) 
18-57, 36-75, 228-267, 396-435 (5'-3', sense) 
96-57, 114-75, 306-267 (5'-3', antisense) 

Genotype 5 or 5a (Consensus Sequence of Figure 6I) 
176-215, 177-216, 181-220, 195-234, 221-260, 252-291, 
255-294, 396-435, 435-474, 447-486, 498-537 (5'-3', sense) 
254-215, 299-260, 310-271, 330-291, 333-294, 354-315, 
464-425, 471-432, 483-444, 570-531 (5'-3', antisense) 

Genotype 6 or 6a (SEQ ID NO:154) 
20-59, 136-175, 156-195, 159-198, 175-214, 185-224, 
277-316, 278-317, 312-351, 348-387, 405-444, 406-445, 
407-446, 408-447, 411-450, 432-471, 433-472, 435-474, 
522-561 (5'-3', sense) 
98-59, 214-175, 234-195, 237-198, 253-214, 262-223, 
263-224, 354-315, 355-316, 382-343, 390-351, 426-387, 
468-429, 483-444, 484-445, 485-446, 486-447, 489-450, 
510-471, 511-472, 513-474 (5'-3', antisense) 

Such nucleotide domains may range from about 15 
to about 100 bases in length with a more preferred range 
being about 30 to about 60 bases in length.

In an alternative embodiment, universal primers 
able to hybridize to the nucleotide sequences of the core 
gene of HCV isolates belonging to all of the genotypes 
disclosed herein may be deduced from universally conserved 
nucleotide domains of the consensus sequence shown in 
Figures 6J and 6K. Examples of such nucleotide domains 
include, but are not limited to, those shown below: 

nucleotides 1-20, 1-25, 1-26, 1-27, 1-33, 50-89, 
51-90, 52-91, 53-92, 61-100, 62-101, 77-116, 78-117, 
79-118, 80-119, 81-120, 82-121, 83-122, 84-123, 85-124, 
86-125, 97-136, 98-137, 99-138, 100-139, 101-140, 102-141, 
329-368, 330-369, 331-370, 332-371, 354-393, 355-394, 
356-395, 362-401, 363-402, 364-403, 365-404, 369-408, 
442-481, 443-482, 457-496, 458-497, 475-514, 476-515, 
477-516 (5'-3', sense); and 

nucleotides 40-1, 41-2, 42-3, 43-4, 51-12, 52-13, 
55-16, 56-17, 57-18, 58-19, 61-22, 62-23, 63-24, 
64-25, 70-31, 124-85, 125-86, 126-87, 127-88, 128-89, 
129-90, 136-97, 137-98, 138-99, 
149-110, 150-111, 151-112, 152-113, 153-114, 154-115, 
155-116, 156-117, 157-118, 158-119, 159-120, 170-131, 
171-132, 172-133, 173-134, 174-135, 175-136, 403-364, 
405-365, 406-366, 406-367, 430-391, 431-392, 432-393, 
436-397, 437-398, 438-399, 439-400, 517-478, 518-479, 
519-480, 532-493, 533-494, 550-511, 551-512 (5'-3', 
antisense) 

Those skilled in the art would readily understand 
that the term "antisense" as used herein refers to primer 
sequences which are the complementary sequence of the 
indicated consensus sequence or SEQ ID NO. Further, 
provided with the above examples of regions of the 
consensus sequences or indicated SEQ ID NOs from which to 
deduce universal and genotype-specific primers, those 
skilled in the art would readily be able to select pairs of 
primers, one sense and one antisense, which would be useful 
in the detection of HCV genotypes via the PCR methods 
described herein.
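
The relationship between the listed coordinates and an actual primer sequence can be sketched as follows. The sketch assumes that coordinates are 1-based and inclusive, that an ascending pair such as 427-466 denotes the sense sequence itself, and that a descending pair such as 505-466 denotes the complement of that region written 5'-3' (the reverse complement). The function names are hypothetical.

```python
# Complement table for DNA; U is mapped to A so RNA input yields a DNA primer.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the complement of `seq`, written 5'-3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_domain(sequence: str, domain: str) -> str:
    """Return the primer sequence for a domain such as '427-466' or '505-466'.

    Coordinates are taken as 1-based and inclusive. A descending pair is
    interpreted as an antisense primer (both are assumptions of this sketch).
    """
    first, last = (int(part) for part in domain.split("-"))
    low, high = min(first, last), max(first, last)
    if low < 1 or high > len(sequence):
        raise ValueError(f"domain {domain} lies outside the sequence")
    segment = sequence[low - 1:high]
    return segment if first <= last else reverse_complement(segment)
```

A sense/antisense pair selected this way flanks the region to be amplified, with the antisense primer taken from the downstream domain.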

In yet another embodiment, the sequences shown in 
SEQ ID NOs:103-154 and the resultant consensus sequences 
produced by alignment of these SEQ ID NOs as shown in 
Figures 6A-6K may also be useful in the design of 
hybridization probes specific for a given HCV genotype. 
Examples of nucleotide domains of the consensus sequence or 
SEQ ID NO of a given genotype from which genotype-specific 
hybridization probes may be deduced include, but are not 
limited to, those shown below, where the sequence in which 
the domains are found is indicated in parentheses to the 
right of each genotype.

Genotype                               Position

1a (Consensus sequence of Figure 6A)   50-85, 155-205, 207-277, 281-333, 429-477, 530-573
1b (Consensus sequence of Figure 6B)   81-131, 159-225, 252-318, 411-472, 530-573
2a (Consensus sequence of Figure 6D)   35-75, 200-276, 290-340, 330-380, 410-472, 530-573
2b (Consensus sequence of Figure 6E)   20-70, 149-199, 191-241, 240-285, 261-318, 323-373,
                                       351-401, 389-439, 429-477, 530-573
2c (SEQ ID NO:134)                     208-258, 230-276, 290-345, 411-460, 430-490, 530-573
3a (Consensus sequence of Figure 6G)   1-50, 40-100, 100-160, 145-190, 190-240, 275-325,
                                       411-455, 466-516, 530-573
4a (SEQ ID NO:139)                     35-85, 145-195, 200-250, 255-305, 341-390, 390-440,
                                       530-573
4b (SEQ ID NO:141)                     35-85, 120-170, 180-225, 230-275, 285-335, 405-455,
                                       462-492, 530-573
4c (SEQ ID NO:143)                     35-85, 190-246, 245-295, 282-318, 372-415, 440-480,
                                       530-573
4d (SEQ ID NO:145)                     35-85, 187-237, 302-352, 405-455, 444-494, 530-573
4e (SEQ ID NO:142)                     35-85, 57-84, 174-224, 230-275, 290-340, 422-472,
                                       530-573
4f (SEQ ID NO:140)                     35-85, 174-224, 242-292, 290-340, 422-472, 530-573
5a (Consensus sequence of Figure 6I)   180-234, 265-315, 315-355, 420-486, 530-573
6a (SEQ ID NO:154)                     34-84, 150-200, 180-230, 230-290, 291-333, 341-395,
                                       429-490, 530-573
1 (Consensus sequence of Figure 6C)    192-241, 435-495
2 (Consensus sequence of Figure 6F)    186-240, 320-360, 440-475
4 (Consensus sequence of Figure 6H)    40-80

In yet another embodiment, universal 
hybridization probes may be derived from the consensus 
sequences shown in Figures 6J and 6K. Examples of 
nucleotide domains of the consensus sequences shown in 
Figures 6J and 6K from which universal hybridization probes 
may be derived include, but are not limited to, 1-33; 85- 
141; 364-408; 478-516.

The oligonucleotides of this invention can be 
synthesized using any of the known methods of 
oligonucleotide synthesis (e.g., the phosphodiester method 
of Agarwal et al. 1972, Angew. Chem. Int. Ed. Engl. 11:451, 
the phosphotriester method of Hsiung et al. 1979, Nucleic 
Acids Res 6:1371, or the automated diethylphosphoramidite 
method of Beaucage et al. 1981, Tetrahedron Letters 
22:1859-1862), or they can be isolated fragments of 
naturally occurring or cloned DNA. In addition, those 
skilled in the art would be aware that oligonucleotides can 
be synthesized by automated instruments sold by a variety 
of manufacturers or can be commercially custom ordered and 
prepared. In a preferred embodiment, the oligonucleotides 
of the present invention are synthetic oligonucleotides. 
The oligonucleotides of the present invention may range 
from about 15 to about 100 nucleotides, with the preferred 
sizes being about 20 to about 60 nucleotides, a more 
preferred size being about 25 to about 50 nucleotides, and 
a most preferred size being about 30 to about 40 
nucleotides.

The present invention also relates to methods for 
detecting the presence of HCV in a mammal, said methods 
comprising analyzing the RNA of a mammal for the presence 
of hepatitis C virus. 

The RNA to be analyzed can be isolated from 
serum, liver, saliva, lymphocytes or other mononuclear 
cells as viral RNA, whole cell RNA or as poly(A)+ RNA. 
Whole cell RNA can be isolated by methods known to those 
skilled in the art. Such methods include extraction of RNA 
by differential precipitation (Birnboim, H.C. (1988) 
Nucleic Acids Res., 16:1487-1497), extraction of RNA by 
organic solvents (Chomczynski, P. et al. (1987) Anal. 
Biochem., 162:156-159) and extraction of RNA with strong 
denaturants (Chirgwin, J.M. et al. (1979) Biochemistry, 
18:5294-5299). Poly(A)+ RNA can be selected from whole cell 
RNA by affinity chromatography on oligo-d(T) columns (Aviv, 
H. et al. (1972) Proc. Natl. Acad. Sci., 69:1408-1412). A 
preferred method of isolating RNA is extraction of viral 
RNA by the guanidinium-phenol-chloroform method of Bukh et 
al. (1992a).

The methods for analyzing the RNA for the 
presence of HCV include Northern blotting (Alwine, J.C. et 
al. (1977) Proc. Natl. Acad. Sci., 74:5350-5354), dot and 
slot hybridization (Kafatos, F.C. et al. (1979) Nucleic 
Acids Res., 7:1541-1552), filter hybridization (Hollander, 
M.C. et al. (1990) Biotechniques, 9:174-179), RNase 
protection (Sambrook, J. et al. (1989) in "Molecular 
Cloning, A Laboratory Manual", Cold Spring Harbor Press, 
Plainview, NY) and reverse-transcription polymerase chain 
reaction (RT-PCR) (Watson, J.D. et al. (1992) in 
"Recombinant DNA", Second Edition, W.H. Freeman and Company, 
New York).

A preferred method for analyzing the RNA is RT- 
PCR. In this method, the RNA can be reverse transcribed to 
first strand cDNA using a primer or primers derived from 



wo 96/05315 



PCT/US95/10398 




the nucleotide sequences shown in SEQ ID NOs:1-51 or SEQ ID 
NOs:103-154 or sequences complementary to those described. 
Once the cDNAs are synthesized, PCR amplification is 
carried out using pairs of primers designed to hybridize 
with sequences in the HCV E1 or core cDNA which are an 
appropriate distance apart (at least about 50 nucleotides) 
to permit amplification of the cDNA and subsequent 
detection of the amplification product. Alternatively, one 
can amplify both E1 and core cDNA sequences by using a 
primer pair where one primer hybridizes with the E1 cDNA 
sequence and the other primer hybridizes with the core cDNA 
sequence. Each primer of a pair is a single-stranded 
oligonucleotide of about 20 to about 60 bases in length, 
with a more preferred range being about 30 to about 50 
bases in length, where one primer (the "upstream" primer) is 
complementary to the original RNA and the second primer 
(the "downstream" primer) is complementary to the first 
strand of cDNA generated by reverse transcription of the 
RNA. The target sequence is generally about 100 to about 
300 base pairs long but can be as large as 500-1500 base 
pairs. Optimization of the amplification reaction to 
obtain sufficiently specific hybridization to the 
nucleotide sequence of interest (either E1 or core or both 
E1 and core) is well within the skill in the art and is 
preferably achieved by adjusting the annealing temperature.
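
The size constraints on a primer pair stated above can be checked mechanically, as in the following sketch. It assumes that both primers are described by 1-based inclusive coordinates on the same cDNA sequence, and that "at least about 50 nucleotides apart" refers to the gap between the two binding sites; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    start: int  # first template position covered (1-based)
    end: int    # last template position covered (inclusive)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def check_primer_pair(left: Primer, right: Primer,
                      min_len: int = 20, max_len: int = 60,
                      min_gap: int = 50) -> list[str]:
    """Return a list of problems; an empty list means the pair meets the stated sizes."""
    problems = []
    for name, primer in (("left", left), ("right", right)):
        if not min_len <= primer.length <= max_len:
            problems.append(f"{name} primer length {primer.length} is outside "
                            f"{min_len}-{max_len} bases")
    gap = right.start - left.end - 1  # bases between the two binding sites
    if gap < min_gap:
        problems.append(f"binding sites are only {gap} bases apart")
    return problems


def amplicon_length(left: Primer, right: Primer) -> int:
    """Length of the amplified target, including both primer binding sites."""
    return right.end - left.start + 1
```

For instance, primers taken from domains 168-207 and 432-480 satisfy both checks and span a target of 313 base pairs, slightly above the general 100 to 300 base pair range but well below the stated upper limit.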

In one embodiment, the primer pairs selected to 
amplify E1 and core cDNAs are universal primers. By 
"universal", as used to describe primers throughout the 
claims and specification, is meant those primer pairs which 
can amplify E1 and/or core gene fragments derived from an 
HCV isolate belonging to any one of the genotypes of HCV 
described herein. Purified and isolated universal primers 
for E1 cDNAs are used in Example 1 of the present invention 
and are shown as SEQ ID NOs:207-212, where SEQ ID NOs:207 
and 208 represent one pair of primers, SEQ ID NOs:209 and 
210 represent a second pair of primers and SEQ ID NOs:211- 
212 represent a third pair of primers. Nucleotide domains 
of the consensus sequence shown in Figure 6J from which 
universal primers for core cDNAs may be deduced have 
previously been disclosed within the present specification. 
Alternatively, a universal primer for E1 cDNA sequence and 
a universal primer for core cDNA sequence may be used as a 
universal primer pair to amplify both E1 and core cDNAs.

In an alternative embodiment, primer pairs 
selected to amplify E1 and/or core cDNAs are genotype- 
specific primers. In the present invention, genotype- 
specific primer pairs can readily be derived from the 
following genotype-specific E1 nucleotide domains: 
nucleotides 197-238 and 450-480 of the consensus sequence 
of genotype I/1a shown in Figure 1A; nucleotides 197-238 
and 450-480 of the consensus sequence of genotype II/1b 
shown in Figure 1B; nucleotides 199-238 and 438-480 of the 
consensus sequence of genotype III/2a shown in Figure 1C; 
nucleotides 124-177 and 450-480 of the consensus sequence 
of genotype IV/2b shown in Figure 1D; nucleotides 124-177, 
193-238 and 436-480 of SEQ ID NO:34 (genotype 2c); 
nucleotides 168-207, 294-339 and 406-480 of the consensus 
sequence of genotype V/3a shown in Figure 1E; nucleotides 
145-183 and 439-480 of SEQ ID NO:40 (genotype 4a); 
nucleotides 168-207 and 432-480 of SEQ ID NO:41 (genotype 
4b); nucleotides 130-183 and 450-480 of the consensus 
sequence of genotype 4c shown in Figure 1F; nucleotides 
130-183 and 450-480 of SEQ ID NO:44 (genotype 4d); 
nucleotides 166-208 and 437-480 of the consensus sequence 
of genotype 5a shown in Figure 1G and nucleotides 168-207, 
216-252 and 429-480 of SEQ ID NO:51 (genotype 6a). 
Genotype-specific HCV core nucleotide domains from which 
genotype-specific primers may be deduced have previously 
been described herein. Those skilled in the art would 
readily appreciate that in a pair of genotype-specific 
primers, each primer is derived from different nucleotide 
domains specific for a given genotype. Also, it is 
understood by those skilled in the art that each pair of 
primers comprises one primer which is complementary to the 
original viral RNA and the other which is complementary to 
the first strand of cDNA generated by reverse transcription 
of the viral RNA. For example, in a pair of genotype- 
specific primers for genotype 4b, one primer would have a 
nucleotide sequence derived from region 168-207 of SEQ ID 
NO:41 and the other primer would have a nucleotide sequence 
which is the complement of region 432-480 of SEQ ID NO:41. 
One skilled in the art would readily recognize that such 
genotype-specific domains would also be useful in designing 
oligonucleotides for use as genotype-specific hybridization 
probes. Indeed, genotype-specific hybridization probes 
deduced from the E1 and core sequences of the present 
invention have been previously disclosed herein.

The amplification products of PCR can be detected 
either directly or indirectly. In one embodiment, direct 
detection of the amplification products is carried out via 
labelling of primer pairs. Labels suitable for labelling 
the primers of the present invention are known to one 
skilled in the art and include radioactive labels, biotin, 
avidin, enzymes and fluorescent molecules. The desired 
labels can be incorporated into the primers prior to 
performing the amplification reaction. A preferred 
labelling procedure utilizes radiolabeled ATP and T4 
polynucleotide kinase (Sambrook, J. et al. (1989) in 
"Molecular Cloning, A Laboratory Manual", Cold Spring 
Harbor Press, Plainview, NY). Alternatively, the desired 
label can be incorporated into the primer extension 
products during the amplification reaction in the form of 
one or more labelled dNTPs. In the present invention, the 
labelled amplified PCR products can be detected by agarose 
gel electrophoresis followed by ethidium bromide staining 
and visualization under ultraviolet light or via direct 
sequencing of the PCR products. Thus, in one embodiment, 
the present invention relates to a method for determining 
the genotype of a hepatitis C virus present in a mammal 
where said method comprises: amplifying RNA of a mammal 
via RT-PCR using labelled genotype-specific primers for the 
amplification step of the cDNA produced by reverse 
transcription.

In yet another embodiment, unlabelled 
amplification products can be detected via hybridization 
with labelled nucleic acid probes, either radioactively 
labelled or labelled with biotin, in methods known to one 
skilled in the art such as dot and slot blot hybridization 
(Kafatos, F.C. et al. (1979)) or filter hybridization 
(Hollander, M.C. et al. (1990)).

In one embodiment, the nucleic acid sequences 
used as probes are selected from, and substantially 
homologous to, SEQ ID NOs:1-51 and/or SEQ ID NOs:103-154. 
Such probes are useful as universal probes in that they can 
detect PCR-amplification products of E1 and/or core cDNAs 
of an HCV isolate belonging to any of the HCV genotypes 
disclosed herein. The size of these probes can range from 
about 200 to about 500 nucleotides. In an alternative 
embodiment, the sequence alignments shown in Figures 1A-1H 
and 6A-6J may be used to design oligonucleotides useful as 
universal hybridization probes. Examples of core and 
envelope nucleotide domains from which such universal 
oligonucleotides may be deduced are disclosed herein.

In yet another embodiment, the present invention
relates to a method for determining the genotype of a
hepatitis C virus present in a mammal where said method
comprises:

(a) amplifying RNA of a mammal via RT-PCR to
produce amplification products;

(b) contacting said products with at least one
genotype-specific oligonucleotide; and

(c) detecting complexes of said products which
bind to said oligonucleotide(s).

In this method, one embodiment of said
amplification step is carried out using the universal
primers for E1 or core cDNAs as disclosed above. In step
(b) of this method, the genotype-specific sequences used as
probes may be deduced from the genotype-specific E1 and
core nucleotide domains disclosed herein. These probes are
useful in specifically detecting PCR amplification products
of E1 or core cDNAs of HCV isolates belonging to one of the
HCV genotypes disclosed herein. In a preferred embodiment,
these probes are used alone or in combination with other
probes specific to the same genotype.

For example, a probe having a sequence according
to SEQ ID NO:213 can be used alone or in combination with a
probe having a sequence according to SEQ ID NO:214. The
probes used in this method can range in size from about 15
to about 100 nucleotides with a more preferred range being
about 30 to about 70 nucleotides. Such probes can be
synthesized as described earlier.
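
As an in silico analogy to steps (b) and (c) above, the following sketch reports which genotype-specific oligonucleotides match an amplified sequence on either strand. It is illustrative only: the probe and amplicon sequences are hypothetical placeholders rather than SEQ ID NOs of this application, and exact string matching is an assumption that ignores hybridization stringency.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def matching_genotypes(amplicon, probes):
    """Return the sorted genotypes whose probe occurs in the amplicon.

    `probes` maps a genotype label to one probe sequence. A probe counts
    as bound if it, or its reverse complement, is an exact substring.
    """
    amplicon = amplicon.upper()
    hits = set()
    for genotype, probe in probes.items():
        probe = probe.upper()
        if probe in amplicon or reverse_complement(probe) in amplicon:
            hits.add(genotype)
    return sorted(hits)


# Hypothetical placeholder sequences, for illustration only.
example_probes = {
    "1a": "ACGTTGCAAGGCTTA",
    "2b": "TTGACCGGATACCGA",
}
example_amplicon = "GGGACGTTGCAAGGCTTACCC"
print(matching_genotypes(example_amplicon, example_probes))
```

In practice, probe length (about 15 to about 100 nucleotides as stated above) and tolerated mismatches would be chosen to suit the hybridization conditions.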

In an alternative embodiment, the genotype of the
amplification product of step (a) may be determined by
using the nucleic acid sequences shown in SEQ ID NOs:1-51
and 103-154 as probes (Delwart, E. et al. (1993) Science
262:1257-1261). Probes utilized in the method of Delwart
et al. may range in size from about 100 to about 1,000
nucleotides with a more preferred probe size being about
200 to about 800 base pairs and a most preferred probe size
being about 300 to about 700 nucleotides.

The nucleic acid sequence used as a probe to
detect PCR amplification products of the present invention
can be labeled in single-stranded or double-stranded form.
Labelling of the nucleic acid sequence can be carried out by
techniques known to one skilled in the art. Such
labelling techniques can include radiolabels and enzymes
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
New York). In addition, there are known non-radioactive
techniques for signal amplification including methods for
attaching chemical moieties to pyrimidine and purine rings
(Dale, R.N.K. et al. (1973) Proc. Natl. Acad. Sci.
70:2238-2242; Heck, R.F. (1968) J. Am. Chem. Soc.
90:5518-5523), methods which allow detection by chemiluminescence
(Barton, S.K. et al. (1992) J. Am. Chem. Soc.
114:8736-8740) and methods utilizing biotinylated nucleic acid
probes (Johnson, T.K. et al. (1983) Anal. Biochem.
133:126-131; Erickson, P.F. et al. (1962) J. of Immunology
Methods 51:241-249; Matthaei, F.S. et al. (1986) Anal.
Biochem. 157:123-128) and methods which allow detection by
fluorescence using commercially available products.

The present invention also relates to computer
analysis of the amino acid sequences shown in SEQ ID
NOs:52-102 by the program GENALIGN. This analysis groups
the 51 amino acid sequences shown in SEQ ID NOs:52-102 into
twelve genotypes based upon the degree of variation of the
amino acid sequences. For the purposes of the present
invention, the amino acid sequence identity of E1 amino
acid sequences of the same genotype ranges from about 85%
to about 100% whereas the identity of E1 amino acid
sequences of different genotypes ranges from about 45% to
about 80%.
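
The identity ranges stated above can be illustrated with a minimal sketch that computes percent identity between two pre-aligned sequences. This is not the GENALIGN algorithm; the function names and the use of 85% as a hard cut-off are assumptions made for illustration.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions at which two sequences are identical.

    Both sequences must already be aligned and of equal length.
    Positions where both sequences carry a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def same_genotype(seq_a, seq_b, threshold=85.0):
    """Apply the approximate 85% lower bound quoted for a single genotype.

    Treating the bound as a strict threshold is an assumption.
    """
    return percent_identity(seq_a, seq_b) >= threshold
```

For example, two ten-residue placeholder sequences that differ at one position have 90% identity and would fall within the same-genotype range under this assumed threshold.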

The grouping of SEQ ID NOs:52-102 into twelve HCV
genotypes is shown below:



SEQ ID NOs:      Genotypes

52-59            I/1a
60-76            II/1b
77-80            III/2a
81-84            IV/2b
85               2c
86-90            V/3a
91               4a
92               4b
93-94            4c
95               4d
96-101           5a
102              6a



For those genotypes containing more than one E1
amino acid sequence, computer alignment of the constituent
sequences of each genotype was conducted using the computer
program GENALIGN in order to produce a consensus sequence
for each genotype. These alignments and their resultant
consensus sequences are shown in Figures 2A-G for the seven
genotypes (I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a)
which comprise more than one sequence. Further alignment
of the consensus sequences shown in Figures 2A-G with the
amino acid sequences of SEQ ID NO:85 (genotype 2c); SEQ ID
NO:91 (genotype 4a); SEQ ID NO:92 (genotype 4b); SEQ ID
NO:95 (genotype 4d) and SEQ ID NO:102 (genotype 6a) to
produce a consensus amino acid sequence for all twelve
genotypes is shown in Figure 2H. The multiple alignment of
E1 amino acid sequences shown in Figures 2A-H produces
consensus sequences which serve to highlight regions of
homology and non-homology between E1 amino acid sequences
of the same genotype and of different genotypes and hence,
these alignments can readily be used by those skilled in
the art to design peptides useful in assays and vaccines
for the diagnosis and prevention of HCV infection.
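
A consensus of the kind described above can be sketched as a column-wise majority over aligned sequences. The rule actually applied by GENALIGN is not stated here, so the majority rule, the function name and the use of 'X' for variable positions are all assumptions.

```python
from collections import Counter


def consensus(aligned, invariant_only=False):
    """Build a consensus string from equal-length aligned sequences.

    Each column contributes its most common residue. With
    invariant_only=True, columns that are not identical in every
    sequence are written as 'X' instead.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must have equal length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        if invariant_only and count != len(aligned):
            out.append("X")
        else:
            out.append(residue)
    return "".join(out)
```

Called with invariant_only=True, the same routine marks the regions of non-homology that distinguish genotypes, which is the property exploited when designing genotype-specific peptides.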

In another embodiment, the computer analysis of
SEQ ID NOs:155-206 by the program GENALIGN results in
distribution of the 52 HCV core sequences into 14 genotypes
based upon identification of genotype-specific amino acid
sequences.

The grouping of SEQ ID NOs:155-206 into 14 HCV
genotypes is shown below:



SEQ ID NOs:      Genotypes

155-160          I/1a
161-176          II/1b
177-180          III/2a
181-185          IV/2b
186              2c
187-190          V/3a
191              4a
193              4b
195              4c
196              4c
197              4d
194              4e
192              4f
198-205          5a
206              6a

These fourteen genotypes can be further grouped
into six major genotypes designated genotypes 1-6 as
described earlier for the core nucleotide sequences of the
present application. Computer alignments of the amino acid
sequences disclosed in SEQ ID NOs:155-206 are shown in
Figures 7A-7J. As with the multiple alignments of the E1
amino acid sequences, the consensus sequences shown in
Figures 7A-7J serve to highlight regions of homology and
non-homology between core amino acid sequences of the same
genotype and of different genotypes and hence, these
alignments can readily be used by those skilled in the art
to design peptides useful in assays and vaccines for the
diagnosis and prevention of HCV infection.
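
The grouping of the core amino acid sequences into fourteen genotypes and six major genotypes can be expressed as a simple lookup. The ranges below are transcribed from the grouping of SEQ ID NOs:155-206 given above; the function names are hypothetical, and deriving the major genotype from the digit in the label is an assumption about the naming convention.

```python
import re

# Core amino acid SEQ ID NO ranges (inclusive) and their genotypes,
# transcribed from the grouping given in the text.
CORE_GENOTYPES = [
    (155, 160, "I/1a"), (161, 176, "II/1b"), (177, 180, "III/2a"),
    (181, 185, "IV/2b"), (186, 186, "2c"), (187, 190, "V/3a"),
    (191, 191, "4a"), (192, 192, "4f"), (193, 193, "4b"),
    (194, 194, "4e"), (195, 196, "4c"), (197, 197, "4d"),
    (198, 205, "5a"), (206, 206, "6a"),
]


def core_genotype(seq_id):
    """Return the genotype label for a core amino acid SEQ ID NO."""
    for first, last, label in CORE_GENOTYPES:
        if first <= seq_id <= last:
            return label
    raise KeyError(f"SEQ ID NO:{seq_id} is not a core amino acid sequence")


def major_genotype(label):
    """Extract the major genotype (1-6) from a label such as 'IV/2b'."""
    return int(re.search(r"(\d)[a-z]$", label).group(1))
```

For instance, SEQ ID NO:183 maps to genotype IV/2b, which belongs to major genotype 2.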

Examples of purified and isolated peptides
deduced from the alignments shown in Figures 2A-2H include,
but are not limited to, SEQ ID NOs:240-263 wherein these
peptides are derived from two regions of the amino acid
sequences shown in Figures 2A-H, amino acids 48-80 and
amino acids 138-160. The peptides shown in SEQ ID
NOs:240-263 are useful as genotype-specific diagnostic reagents
since they are capable of detecting an immune response
specific to HCV isolates belonging to a single genotype.
The genotype-specificity of the peptides shown in SEQ ID
NOs:240-263 is as follows: SEQ ID NOs:240 and 252 are
specific for genotype IV/2b; SEQ ID NOs:241 and 253 are
specific for genotype 2c; SEQ ID NOs:242 and 254 are
specific for genotype III/2a; SEQ ID NOs:243 and 255 are
specific for genotype V/3a; SEQ ID NOs:244 and 256 are
specific for genotype II/1b; SEQ ID NOs:245 and 257 are
specific for genotype I/1a; SEQ ID NOs:246 and 258 are
specific for genotype 4a; SEQ ID NOs:247 and 259 are
specific for genotype 4c; SEQ ID NOs:248 and 260 are
specific for genotype 4d; SEQ ID NOs:249 and 261 are
specific for genotype 4b; SEQ ID NOs:250 and 262 are
specific for genotype 5a and SEQ ID NOs:251 and 263 are
specific for genotype 6a. In SEQ ID NO:240, Xaa at
position 22 is a residue of Ala or Thr, Xaa at position 24
is a residue of Val or Ile, Xaa at position 26 is a residue
of Val or Met; in SEQ ID NO:242, Xaa at position 5 is a Ser
or Thr residue, Xaa at position 11 is an Arg or Gln
residue, Xaa at position 12 is an Arg or Gln residue; in
SEQ ID NO:243, Xaa at position 3 is a Pro or Ser residue,
Xaa at position 33 is a Leu or Met residue; in SEQ ID
NO:244, Xaa at position 5 is a Thr or Ala residue, Xaa at
position 13 is a Gly, Ala, Ser, Val or Thr residue, Xaa at
position 14 is a Ser, Thr or Asn residue, Xaa at position
15 is a Val or Ile residue, Xaa at position 16 is a Pro or
Ser residue, Xaa at position 18 is a Thr or Lys residue,
Xaa at position 19 is a Thr or Ala residue, Xaa at position
22 is an Arg or His residue, Xaa at position 32 is an Ala,
Val or Thr residue; in SEQ ID NO:245, Xaa at position 3 is
an Ala or Pro residue, Xaa at position 4 is a Val or Met
residue, Xaa at position 5 is a Thr or Ala residue, Xaa at
position 17 is a Thr or Ala residue, Xaa at position 18 is
a Thr or Ala residue, Xaa at position 23 is a His or Tyr
residue; in SEQ ID NO:247, Xaa at position 10 is a Val or
Ala residue, Xaa at position 11 is a Ser or Pro residue,
Xaa at position 18 is an Asp or Glu residue, Xaa at position
20 is a Leu or Ile residue; in SEQ ID NO:250, Xaa at
position 3 is a Gln or His residue, Xaa at position 12 is
an Asn, Ser or Thr residue, Xaa at position 13 is a Leu or
Phe residue, Xaa at position 23 is an Ala or Val residue;
in SEQ ID NO:252, Xaa at position 16 is a Val or Ala
residue, Xaa at position 18 is a Glu or Gln residue; in SEQ
ID NO:254, Xaa at position 2 is an Ala or Thr residue, Xaa
at position 4 is a Met or Leu residue, Xaa at position 9 is
an Ala or Val residue, Xaa at position 17 is an Ile or Leu
residue, Xaa at position 20 is an Ile or Val residue, Xaa
at position 21 is a Ser or Gly residue; in SEQ ID NO:151,
Xaa at position 9 is a Val or Ile residue, Xaa at position
16 is a Leu or Val residue, Xaa at position 20 is an Ile or
Leu residue; in SEQ ID NO:256, Xaa at position 2 is an Ala
or Thr residue, Xaa at position 6 is a Val or Leu residue,
Xaa at position 12 is an Ile or Leu residue, Xaa at
position 16 is a Val or Ile residue, Xaa at position 17 is
a Val, Leu or Met residue, Xaa at position 19 is a Met or
Val residue, Xaa at position 21 is an Ala or Thr residue;
in SEQ ID NO:257, Xaa at position 2 is a Thr or Ala
residue, Xaa at position 6 is a Val, Ile or Met residue,
Xaa at position 12 is an Ile or Val residue, Xaa at
position 16 is an Ile or Val residue; in SEQ ID NO:155, Xaa
at position 5 is a Leu or Val residue, Xaa at position 21
is a Thr or Ala residue; in SEQ ID NO:262, Xaa at position
1 is a Thr or Ala residue, Xaa at position 5 is a Val or
Leu residue, Xaa at position 9 is a Leu, Met or Val
residue, Xaa at position 23 is a Gly or Ala residue.

Examples of core amino acid domains from which
genotype-specific peptides may be deduced include, but are
not limited to, those shown below where the sequence in
which the indicated domains are found is given in
parentheses to the right of each genotype:

Genotype                                          Amino Acid Domains

1a   (consensus sequence of Figure 7A)            67-78
1b   (consensus sequence of Figure 7B)            67-78
2    (consensus sequence of Figure 7F)            66-81
                                                  110-119
2a   (consensus sequence of Figure 7D)            67-78
                                                  115-125
2b   (consensus sequence of Figure 7E)            67-78
                                                  123-133
2c   (SEQ ID NO:186)                              67-78
                                                  75-81
                                                  184-191
3a   (consensus sequence of Figure 7G)            8-22
                                                  32-46
                                                  67-78
                                                  158-170
                                                  180-191
4    (consensus sequence of Figure 7H)            14-23
4a   (SEQ ID NO:191)                              67-78
4b   (SEQ ID NO: )                                67-78
4c   (SEQ ID NO:195)                              67-78
4d   (SEQ ID NO:197)                              67-78
4e   (SEQ ID NO:194)                              67-78
4f   (SEQ ID NO:192)                              67-78
5a   (consensus sequence of Figure 7J)            67-78
6a   (SEQ ID NO:206)                              67-78
                                                  101-108
                                                  144-155
                                                  157-163

Those skilled in the art would be aware that the
peptides of the present invention or analogs thereof can be
synthesized by automated instruments sold by a variety of
manufacturers or can be commercially custom-ordered and
prepared. The term analog has been described earlier in
the specification and for purposes of describing the
peptides of the present invention, analogs can further
include branched, cyclic or other non-linear arrangements
of the peptide sequences of the present invention.

Alternatively, peptides can be expressed from
nucleic acid sequences where such sequences can be DNA,
cDNA, RNA or any variant thereof which is capable of
directing protein synthesis. In one embodiment,
restriction digest fragments containing a coding sequence
for a peptide can be inserted into a suitable expression
vector that functions in prokaryotic or eukaryotic cells.
Such restriction digest fragments may be obtained from
clones isolated from prokaryotic or eukaryotic sources
which encode the peptide sequence.

Suitable expression vectors and methods of
isolating clones encoding the peptide sequences of the
present invention have previously been described. In yet
another embodiment, an oligonucleotide capable of directing
host organism synthesis of the given peptide may be
synthesized and inserted into the expression vector.

The preferred size of the peptides of the present
invention is from about 8 to about 100 amino acids in
length when the peptides are chemically synthesized with a
more preferred size being about 8 to about 30 amino acids
and a most preferred size being about 10 to about 20 amino
acids in length. For recombinantly expressed peptides, the
size may range from about 20 to about 190 amino acids in
length with a more preferred size being about 70 amino
acids.

The present invention further relates to the use
of genotype-specific peptides in methods of detecting
antibodies against a specific genotype of HCV in biological
samples. In one embodiment, at least one genotype-specific
peptide deduced from a genotype-specific core or E1 amino
acid domain may be used in any of the immunoassays described
herein to detect antibodies specific for a single genotype
of HCV. In another embodiment, at least one
genotype-specific peptide deduced from a genotype-specific core
nucleotide domain and at least one genotype-specific
peptide deduced from an E1 amino acid domain may be used in
an immunoassay to detect antibodies against a single
genotype of HCV. A preferred immunoassay is ELISA.

It is understood by those skilled in the art that
the diagnostic assays described herein using
genotype-specific oligonucleotides or genotype-specific peptides can
be useful in assisting one skilled in the art to choose a
course of therapy for the HCV-infected individual.

In an alternative embodiment, a mixture of
genotype-specific peptides can be used in an immunoassay to
detect antibodies against multiple genotypes of HCV
disclosed herein. For example, a mixture of
genotype-specific peptides deduced from E1 amino acid sequences may
comprise at least one peptide selected from SEQ ID
NOs:244-245 and 256-257; one peptide selected from SEQ ID NOs:240,
242, 252 and 254; one peptide selected from SEQ ID
NOs:246-249 and 258-261; one peptide selected from SEQ ID NOs:250
and 262; one peptide selected from SEQ ID NOs:243 and 255;
one peptide selected from SEQ ID NOs:242 and 254 and one
peptide selected from SEQ ID NOs:244 and 263. In a
preferred embodiment, the peptides of the present invention
can be used in an ELISA assay as described previously for
recombinant E1 and core proteins.

In an alternative embodiment, the peptide(s)
utilized in an immunoassay to detect all the genotypes of
HCV disclosed herein may be a universal peptide deduced
from universally conserved amino acid domains of the E1 or
core proteins disclosed herein.

Examples of universally conserved core amino acid
domains within the consensus sequence shown in Figure 7J
from which universal peptides may be deduced include, but
are not limited to, amino acid domains 23-35, 53-66, 93-108,
122-138, 150-156, and 165-181 of the consensus sequence.
Examples of universally conserved E1 amino acid domains
within the HCV E1 protein are located within the consensus
sequence for the 51 HCV E1 proteins shown in Figure 2H of
the present application. Examples of universally conserved
domains within the consensus sequence shown in Figure 2H
include, but are not limited to, amino acid domains 10-20,
111-120, and 124-137 of the consensus sequence. The
universal peptides of the present invention may be used in
an immunoassay to detect antibodies in patient sera
specific for any of the genotypes of HCV disclosed herein.

The peptides of the present invention or analogs
thereof may be prepared in the form of a kit, alone or in
combinations with other reagents such as secondary
antibodies, for use in immunoassay.
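
By way of illustration, the universally conserved core domains recited above can be cut out of a consensus sequence by position. The sketch assumes that the stated positions are 1-based and inclusive; the function name is hypothetical and no actual consensus sequence is reproduced here.

```python
# Universally conserved core domains recited in the text
# (assumed to be 1-based, inclusive positions).
CORE_UNIVERSAL_DOMAINS = [(23, 35), (53, 66), (93, 108),
                          (122, 138), (150, 156), (165, 181)]


def extract_domains(sequence, domains):
    """Return {(start, end): subsequence} for 1-based inclusive ranges."""
    peptides = {}
    for start, end in domains:
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"domain {start}-{end} is outside the sequence")
        # Convert the 1-based inclusive range to a Python slice.
        peptides[(start, end)] = sequence[start - 1:end]
    return peptides
```

Each extracted domain has a length of end minus start plus one, so that domain 23-35, for example, yields a 13-residue candidate peptide.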

In another embodiment, the genotype-specific and
universal peptides of the present invention may be used to
produce antibodies that will react against HCV E1 or core
proteins in immunoassays. In one embodiment, a
genotype-specific E1 or core peptide can be used alone or in
combination with other E1 or core peptides specific to the
same genotype as immunogens to produce antibodies specific
to HCV proteins of a single genotype.

In another embodiment, a mixture of peptides
specific for different genotypes may be used to produce
antibodies that will react with HCV proteins of any
genotype disclosed herein. More preferably, antibodies
reactive with HCV proteins of any genotype may be produced
by immunizing an animal with universal peptide(s) of the
present invention. Examples of immunoassays in which such
antibodies could be utilized to detect HCV E1 and core
proteins in biological samples include, but are not limited
to, radioimmunoassays and ELISAs. Examples of biological
samples in which HCV E1 and core proteins could be detected
include, but are not limited to, serum, saliva and
liver.

Of course, those skilled in the art would readily
understand that the genotype-specific and universal
peptides of the present invention and expression vectors
containing nucleic acid sequence capable of directing host
organism synthesis of these peptides could also be used as
vaccines against hepatitis C. Formulations suitable for
administering the peptide(s) and expression vectors of the
present invention as immunogen, routes of administration,
pharmaceutical compositions comprising the peptides or
expression vectors and so forth are the same as those
previously described for recombinant E1 and core proteins.

The genotype-specific and universal peptides of
the present invention and expression vectors containing
nucleic acid sequence capable of directing host organism
synthesis of these peptides may also be supplied in the
form of a kit, alone, or in the form of a pharmaceutical
composition as described above for recombinant E1 and core
proteins.

Any articles or patents referenced herein are
incorporated by reference. The following examples
illustrate various aspects of the invention, but are in no
way intended to limit the scope thereof.



MATERIALS

Serum used in these examples was obtained from
84 anti-HCV positive individuals who were previously found
to be positive for HCV RNA in a cDNA PCR assay with primer
set a from the 5' NC region of the HCV genome (Bukh, J. et
al. (1992b) Proc. Natl. Acad. Sci. USA 89:4942-4946).
These samples were from 12 countries: Denmark (DK);
Dominican Republic (DR); Germany (D); Hong Kong (HK); India
(IND); Sardinia, Italy (S); Peru (P); South Africa (SA);
Sweden (SW); Taiwan (T); United States (US); and Zaire (Z).

Example 1

Identification of the cDNA Sequence
of the E1 Gene of 51 Isolates of HCV via
RT-PCR Analysis of Viral RNA Using Universal Primers

Viral RNA was extracted from 100 µl of serum by
the guanidinium-phenol-chloroform method and the final RNA
solution was divided into 10 equal aliquots and stored at
-80°C as described (Bukh et al. (1992a)). The sequences
of the synthetic oligonucleotides used in the RT-PCR assay,
deduced from the sequence of HCV strain H-77 (Ogata, N. et
al. (1991) Proc. Natl. Acad. Sci. USA 88:3392-3396), are
shown as SEQ ID NOs:207-212. One aliquot of the final RNA
solution, equivalent to 10 µl of serum, was used for cDNA
synthesis that was performed in a 20 µl reaction mixture
using avian myeloblastosis virus reverse transcriptase
(Promega, Madison, WI) and SEQ ID NO:208 as a primer. The
resulting cDNA was amplified in a "nested" PCR assay by Taq
DNA polymerase (Amplitaq, Perkin-Elmer/Cetus) as described
previously (Bukh et al. (1992a)) with primer set e (SEQ ID
NOs:207-210). Precautions were taken to avoid
contamination with exogenous HCV nucleic acid (Bukh et al.
(1992a)), and negative controls (normal, uninfected serum)
were interspersed between every test sample in both the RNA
extraction and cDNA PCR procedures. No false positive
results were observed in the analysis. In most instances,
amplified DNA (first or second PCR products) was
reamplified with primers SEQ ID NO:211 and SEQ ID NO:212
prior to sequencing since these two primers contained EcoRI
sites which would facilitate future cloning of the E1 gene.
Amplified DNA was purified by gel electrophoresis followed
by glass-milk extraction (Geneclean, BIO 101, La Jolla, CA)
and both strands were sequenced directly by the
dideoxynucleotide chain termination method (Bachman, B. et al.
(1990) Nucl. Acids Res. 18:1309) with phage T7 DNA
polymerase (Sequenase, United States Biochemicals,
Cleveland, OH), [alpha-35S]dATP (Amersham, Arlington
Heights, IL) or [alpha ^^P]dATP (Amersham or DuPont,
Wilmington, DE) and sequencing primers. RNA extracted from
serum containing HCV strain H-77, previously sequenced by
Ogata, N. et al. (1991), was amplified with primer set e
(SEQ ID NOs:207-210) and sequenced in parallel as a
control. The nucleotide sequences of the envelope 1 (E1)
gene of all 51 HCV isolates are shown as SEQ ID NOs:1-51.
In all 51 HCV isolates, the E1 gene was exactly 576
nucleotides in length and did not have any in-frame stop
codons.
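
The two properties reported above for each E1 sequence, a length of exactly 576 nucleotides and the absence of in-frame stop codons, can be checked with a short routine. This is a minimal sketch assuming the standard genetic code; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(cdna):
    """Return 0-based codon indices of in-frame stop codons."""
    cdna = cdna.upper().replace("U", "T")
    if len(cdna) % 3 != 0:
        raise ValueError("length is not a multiple of three")
    codons = (cdna[i:i + 3] for i in range(0, len(cdna), 3))
    return [index for index, codon in enumerate(codons)
            if codon in STOP_CODONS]


def is_open_reading_frame(cdna, expected_length=576):
    """True if the sequence has the expected length and no in-frame stops."""
    return len(cdna) == expected_length and not in_frame_stops(cdna)
```

A 576-nucleotide sequence corresponds to 192 codons, consistent with the 192 amino acids of the E1 protein referred to in Example 2.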

Example 2

Computer Analysis of the Nucleotide
and Deduced Amino Acid Sequences
of the E1 Gene of 51 HCV Isolates

Multiple computer-generated alignments of the
nucleotide (SEQ ID NOs:1-51, Figures 1A-H) and deduced
amino acid sequences (SEQ ID NOs:52-102, Figures 2A-H) of
the cDNAs of the 51 HCV isolates constructed using the
computer program GENALIGN (Miller, R.H. et al. (1990) Proc.
Natl. Acad. Sci. USA 87:2057-2061) resulted in the 51 HCV
isolates being divided into twelve genotypes based upon the
degree of variation of the E1 gene sequence as shown in
Table 1.



The nucleotide and amino acid sequence identity
of HCV isolates of the same genotype was in the range of
88.0-99.1% and 89.1-98.4%, respectively, whereas that of
HCV isolates of different genotypes was in the range of
53.5-78.6% and 49.0-82.8%, respectively. The latter
differences are similar to those found when comparing the
envelope gene sequences of the various serotypes of the
related flaviviruses, as well as other RNA viruses. When
microheterogeneity in a sequence was observed, defined as
more than one prominent nucleotide at a specific position,
the nucleotide that was identical to that of the HCV
prototype (HCV1, Choo et al. (1989)) was reported if
possible. Alternatively, the nucleotide that was identical
to the most closely related isolate is shown.
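
The rule described above for reporting a single nucleotide at a microheterogeneous position can be sketched as follows. The function name and arguments are hypothetical, and raising an error when neither reference nucleotide was observed is an assumption, since that case is not addressed in the text.

```python
def resolve_position(observed, prototype_base, closest_isolate_base):
    """Pick the nucleotide to report at a microheterogeneous position.

    `observed` is the set of prominent nucleotides seen at the position.
    The prototype base is preferred when it was observed; otherwise the
    base shared with the most closely related isolate is used.
    """
    observed = {base.upper() for base in observed}
    if len(observed) == 1:
        return next(iter(observed))
    if prototype_base.upper() in observed:
        return prototype_base.upper()
    if closest_isolate_base.upper() in observed:
        return closest_isolate_base.upper()
    raise ValueError("neither reference base was observed at this position")
```

For a position showing both C and T, the routine reports T when the prototype carries T, and otherwise falls back to the nucleotide of the most closely related isolate.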

Analysis of the consensus sequence of the E1
protein of the 51 HCV isolates from this study demonstrated
that a total of 60 (30.3%) of the 192 amino acids of the E1
protein were invariant among these isolates (Fig. 3). Most
impressive, all 8 cysteine residues as well as 6 of 8
proline residues were invariant. The most abundant amino
acids (e.g. alanine, valine and leucine) showed a very low
degree of conservation. The consensus sequence of the E1
protein contained 5 potential N-linked glycosylation sites.
Three sites at positions 209, 305 and 325 were maintained
in all 51 HCV isolates. A site at position 196 was
maintained in all isolates except the sole isolate of
genotype 2c. Also, a site at position 234 was maintained
in all isolates except one isolate of genotype I/1a, all
four isolates of genotype IV/2b and the sole isolate of
genotype 6a. Conversely, only genotype IV/2b isolates had
a potential glycosylation site at position 233. Further
analysis revealed a highly conserved amino acid domain (aa
302-328) in the E1 protein with 20 (74.1%) of 27 amino
acids invariant among all 51 HCV isolates. It is possible
that the 5' and 3' ends of this domain are conserved due to
important cysteine residues and N-linked glycosylation
sites. The central sequence, 5'-GHRMAWDMM-3' (aa 315-323),
may be conserved due to additional functional constraints
on the protein structure. Finally, although the amino acid
sequence surrounding the putative E1 protein cleavage site
was variable, an amino acid doublet (GV) at position 380
was invariant among all HCV isolates.
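
Potential N-linked glycosylation sites of the kind counted above are conventionally recognized by the sequon Asn-Xaa-Ser/Thr, where Xaa is any residue other than proline. The following sketch locates such sequons in a protein sequence; the function name and the `offset` parameter for reporting positions in another numbering scheme are assumptions, and the test sequence in the usage note is a placeholder.

```python
import re

# N-X-S/T sequon, where X is any residue except proline. The lookahead
# lets overlapping sequons (e.g. "NNTT") be reported individually.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(protein, offset=1):
    """Return positions of the Asn in each potential N-linked site.

    `offset` is the number assigned to the first residue, so that
    positions can be reported in another coordinate system if desired.
    """
    return [match.start() + offset
            for match in SEQUON.finditer(protein.upper())]
```

For the placeholder sequence AANLSAANPSANNTT, the routine reports positions 3, 12 and 13 and skips the Asn-Pro-Ser motif at position 8.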

A dendrogram of the genetic relatedness of the E1
protein of selected HCV isolates representing the 12
genotypes is shown in Fig. 4. This dendrogram was
constructed using the program CLUSTAL (Higgins, D.G. et al.
(1988) Gene 73:237-244) and had a limit of 25 sequences.
The scale showing percent identity was added based upon
manual calculation. From the 51 HCV isolates for which the
complete sequence of the E1 gene region was obtained, 25
isolates representing the twelve genotypes were selected
for analysis. This dendrogram in combination with the
analysis of the E1 gene sequence of 51 HCV isolates in
Table 1 demonstrates extensive heterogeneity of this
important gene.

The worldwide distribution of the 12 genotypes
among 74 HCV isolates is depicted in Fig. 5. The complete
E1 gene sequence was determined in 51 of these HCV isolates
(SEQ ID NOs:1-51), including 8 isolates of genotype I/1a,
17 isolates of genotype II/1b and 26 isolates comprising
genotypes III/2a, IV/2b, 2c, 3a, 4a-4d, 5a and 6a. In the
remaining 23 isolates, all of genotypes I/1a and II/1b, the
genotype assignment was based on a partial E1 gene sequence
since they did not represent additional genotypes in any of
the 12 countries. The number of isolates of a particular
genotype is given in each of the 12 countries studied. Of
the twelve genotypes, genotypes I/1a and II/1b were the
most common accounting for 48 (65%) of the 74 isolates.
Analysis of the E1 gene sequences available in the GenBank
data base at the time of this study revealed that all 44
such sequences were of genotypes I/1a, II/1b, III/2a and
IV/2b. Thus, based upon E1 gene analysis, 8 new genotypes
were identified.

Also of interest, different HCV genotypes were
frequently found in the same country, with the highest
number of genotypes (five) being detected in Denmark. Of
the twelve genotypes, genotypes I/1a, II/1b, III/2a, IV/2b
and V/3a were widely distributed with genotype II/1b being
identified in 11 of 12 countries studied (Zaire was the
only exception). In addition, while genotypes I/1a and
II/1b were predominant in the Americas, Europe and Asia,
several new genotypes were predominant in Africa.

It was also found that genotypes I/1a, II/1b,
III/2a, IV/2b and V/3a of HCV were widely distributed
around the world, whereas genotypes 2c, 4a, 4b, 4d, 5a and
6a were identified only in discrete geographical regions.
For example, the majority of isolates in South Africa
comprised a new genotype (5a) and all isolates in Zaire
comprised 3 new closely related genotypes (4a, 4b, 4c).
These genotypes were not identified outside Africa.



Example 3

Identification of the cDNA Sequence
of the Core Gene of 52 Isolates of HCV

Viral RNA extraction, cDNA synthesis and "nested"
PCR were carried out as in Example 1. For the cDNA PCR
assay, HCV-specific synthetic oligonucleotides deduced from
previously determined sequences that flank the C gene were
used. Amplified DNA was purified by gel electrophoresis
followed by glass-milk extraction as described in Example 1
or by electroelution and both strands were sequenced
directly. In 44 of the 52 HCV isolates studied the
procedures for direct sequencing described in Example 1
were utilized. For a number of the HCV isolates
confirmatory sequencing was performed with the Applied
Biosystems 373A automated DNA sequencer and 8 HCV isolates
of genotype I/1a or II/1b were sequenced exclusively by
this method. All 73 negative control samples interspersed
among the test samples were negative for HCV RNA.

The amplified DNA fragment obtained in 50 of the
52 HCV isolates was specifically designed to overlap with
previously obtained 5' NC sequences (Bukh et al. (1992b)
Proc. Natl. Acad. Sci. U.S.A. 89:4942-4946) and with the E1
sequences disclosed herein at approximately 80 nucleotide
positions each. A complete match was observed in 6033 of
6035 overlapping nucleotides. Two discrepancies were
observed in isolate US6 at nt 552 (C and T) and nt 561 (C
and T) respectively. This may have been due to
microheterogeneity at these nucleotide positions, since the
remaining overlapping sequence was unique for isolate US6.
In addition, there were 3 confirmed instances of
microheterogeneity: nt 33 in isolate SA11 (C,T and T), nt
36 in isolate S45 (A,C and A), and nt 552 in isolate P10
(C,T and T). Overall, the excellent agreement in these
overlapping sequences in this study with the NC sequences
disclosed in Bukh et al. and with the E1 sequences
disclosed herein definitively ruled out contamination as a
source of non-authentic HCV sequences. Furthermore, this
analysis proved that the sequences obtained were from a
single population, and not from different populations as
could happen in mixed infections.

The core (C) gene was exactly 573 nucleotides in
length in all 52 HCV isolates with an amino terminal start
codon and no in-frame stop codons. Microheterogeneity was
observed in 26 of the 52 HCV isolates at 0.2-1.4% of the
573 nucleotide positions of the C gene, and resulted in
changes in 0.5-1.0% of the 191 predicted amino acids in 12
of these isolates. A multiple sequence alignment was
performed and it showed that the nucleotide identities of
the C gene among these HCV isolates were in the range of
79.4-99.0%. In order to compare the genetic relatedness of
HCV isolates in different gene regions, phylogenetic trees
of the C gene of all 52 HCV isolates and the E1 gene of 51
HCV isolates were constructed using the unweighted
pair-group method with arithmetic mean (Nei, M. (1987)
Molecular Evolutionary Genetics (Columbia University Press,
New York, N.Y.), pp. 287-326) (Figure 8). In both
dendrograms a division of the 45 HCV isolates from which C
and E1 genes had been cloned into at least six major
genetic groups (genotypes 1-6) and 12 minor genetic groups
(genotypes I/1a, II/1b, III/2a, IV/2b, 2c, V/3a, 4a-4d, 5a,
and 6a) was observed. It is noteworthy that a major
division in genetic distance between HCV isolates of
genotype 2 and those of the other genotypes was observed in
the phylogenetic analyses of both gene sequences.
Furthermore, the divergence of the minor genotypes within
genotype 2 exhibited a degree of heterogeneity that is
equivalent to that observed among the major genotypes.
Analysis of the C gene from isolates 25 and Z8, which had a
unique 5' NC sequence (Bukh et al. (1992)) but from which
the E1 gene could not be amplified, revealed that these
isolates represented two additional genotypes. The
designations 4e and 4f are assigned to these genotypes,
which have not been described previously. Overall, the
present specification demonstrates that the genetic
relatedness of HCV isolates is equivalent when analyzing
the most conserved gene (C) and one of the most variable
genes (E1) of the HCV genome, thereby providing strong
evidence for the suggested division into major and minor
genotypes.
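
The pairwise identity and unweighted pair-group clustering
described above can be illustrated with a minimal sketch.
The four toy sequences and the function names below are
hypothetical and are not taken from the isolates or the
software used in this study; distance is assumed here to be
100 minus the percent nucleotide identity.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of aligned positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def distance_table(seqs):
    """Map each unordered pair of names to 100 - percent identity."""
    return {frozenset((a, b)): 100.0 - percent_identity(seqs[a], seqs[b])
            for a, b in combinations(seqs, 2)}


def upgma(names, dist):
    """Cluster by UPGMA and return the tree as nested tuples.

    dist maps frozenset({x, y}) to the distance between x and y.
    """
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = clusters[a], clusters[b]
        # Distance to the new cluster is the size-weighted mean, which
        # equals the arithmetic mean over all leaf pairs.
        for c in clusters:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
                ) / (na + nb)
        del clusters[a], clusters[b]
        clusters[merged] = na + nb
    return next(iter(clusters))


# Hypothetical 10-nt toy alignment (not HCV data).
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TTGTACCTAT",
    "D": "TTGTACCTAC",
}
tree = upgma(list(toy), distance_table(toy))
print(tree)
```

In this toy case A groups with B and C groups with D before
the two pairs are joined, which mirrors how closely related
isolates form minor groups inside a major group.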

Example 4

Computer Analysis of the Nucleotide and Deduced
Amino Acid Sequences Of The Core Gene Of 52 HCV Isolates

In order to study further the heterogeneity of
the C gene, a consensus sequence of the core gene from the
52 HCV isolates (Fig. 6J) was obtained. A total of 335
(58.5%) of the 573 nucleotides of the C gene were invariant
among these HCV isolates. Nucleotides at the 1st and 2nd
codon positions were invariant at 70.7% and 81.7% of these
positions, respectively, while nucleotides at the 3rd
position were invariant at only 23.0% of such positions.
Stretches of 6 or more invariant nucleotides were observed
from nucleotides 1-8, 22-27, 85-92, 110-125, 131-141,
334-340, 364-371, 397-404, and 511-516 and may be suitable
for anchoring primers for amplification of HCV RNA in cDNA
PCR assays.
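
The invariance figures above can be reproduced from any
in-frame alignment along the following lines. This is a
minimal sketch only: the three 12-nt sequences are
hypothetical toy data, and the function names are
illustrative rather than those of any program used in this
study.

```python
def invariant_positions(alignment):
    """Return 1-based positions where all aligned sequences agree."""
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("all sequences must have the same length")
    return [i + 1 for i in range(length)
            if len({s[i] for s in alignment}) == 1]


def invariant_fraction_by_codon_position(alignment):
    """Fraction of invariant sites at codon positions 1, 2 and 3.

    Assumes the alignment starts at the first base of a codon.
    """
    inv = set(invariant_positions(alignment))
    length = len(alignment[0])
    fractions = {}
    for codon_pos in (1, 2, 3):
        sites = range(codon_pos, length + 1, 3)
        fractions[codon_pos] = sum(1 for p in sites if p in inv) / len(sites)
    return fractions


def invariant_stretches(alignment, min_len=6):
    """Return (start, end) of runs of at least min_len invariant sites."""
    stretches = []
    start = prev = None
    for p in invariant_positions(alignment):
        if prev is not None and p == prev + 1:
            prev = p  # extend the current run
            continue
        if start is not None and prev - start + 1 >= min_len:
            stretches.append((start, prev))
        start = prev = p  # begin a new run
    if start is not None and prev - start + 1 >= min_len:
        stretches.append((start, prev))
    return stretches


# Hypothetical toy alignment of four codons (not HCV data).
toy_alignment = ["ATGAGCACGAAT", "ATGAGCACAAAC", "ATGAGCACGAAT"]
print(invariant_fraction_by_codon_position(toy_alignment))
print(invariant_stretches(toy_alignment))
```

In the toy alignment only third codon positions vary, so the
first and second positions are fully invariant and the
single stretch of 6 or more invariant nucleotides runs from
nucleotide 1 to 8.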

Genotype-specific nucleotide positions of the
core gene of hepatitis C virus were also noted for each of
the genotypes. These genotype-specific nucleotides are
shown below where each genotype-specific nucleotide is
given in parentheses next to the nucleotide position in
which it is found.

Genotype 1: 460 (C), 466 (C), 483 (C), 486 (G).

Genotype I/1a: 180 (T).

Genotype II/1b: 106 (C), 273 (G).

Genotype 2: 192 (C), 201 (A), 203 (A), 207 (G), 210 (C),
221 (A), 231 (A), 232 (A), 341 (A).

Genotype III/2a: 315 (C), 355 (G).

Genotype IV/2b: 45 (A), 174 (G), 216 (C), 348 (A), 376 (A),
414 (T).

Genotype 2c: 233 (G), 312 (C), 318 (A), 456 (C), 462 (G),
543 (C), 556 (T).

Genotype V/3a: 47 (T), 84 (A), 106 (G), 126 (A), 150 (T),
212 (G), 216 (A), 300 (A), 491 (T), 559 (C), 560 (A), 568
(G), 571 (A), 572 (G).

Genotype 4: 59 (T).

Genotype 4a: 213 (A), 231 (G), 415 (A).

Genotype 4b: 66 (G), 145 (G), 310 (A).

Genotype 4c: 213 (T), 219 (A), 270 (T).

Genotype 4d: 212 (T), 327 (G), 469 (C).

Genotype 4e: 199 (C), 306 (A), 326 (A).

Genotype 4f: 57 (T), 75 (A), 267 (A).

Genotype 5a: 291 (G), 294 (C).

Genotype 6a: 59 (C), 175 (A), 195 (A), 198 (A), 214 (C),
224 (A), 316 (C), 351 (G), 387 (G), 444-447 (GGCT), 450
(G), 471-472 (AA), 474 (C).

These genotype-specific nucleotides are of
utility in designing the genotype-specific PCR primers and
hybridization probes.
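
As an illustration of how such genotype-specific positions
might be applied to a newly determined core sequence, the
following minimal sketch checks a sequence against a small
subset of the positions listed above. The function name, the
choice of subset and the synthetic test sequence are
assumptions made for illustration; this is not a validated
typing assay, and positions are assumed to be numbered from
the first nucleotide of the C gene.

```python
# Subset of the genotype-specific positions listed above
# (1-based numbering within the 573-nt C gene).
GENOTYPE_MARKERS = {
    "I/1a": {180: "T"},
    "II/1b": {106: "C", 273: "G"},
    "III/2a": {315: "C", 355: "G"},
    "4b": {66: "G", 145: "G", 310: "A"},
    "5a": {291: "G", 294: "C"},
}


def matching_genotypes(core_seq, markers=GENOTYPE_MARKERS):
    """Return genotypes whose listed positions all match core_seq."""
    seq = core_seq.upper()
    hits = []
    for genotype, positions in markers.items():
        if all(pos <= len(seq) and seq[pos - 1] == nt
               for pos, nt in positions.items()):
            hits.append(genotype)
    return hits


def synthetic_core(substitutions, length=573):
    """Build a placeholder sequence of N's with given 1-based bases set.

    Hypothetical helper used only to exercise matching_genotypes.
    """
    bases = ["N"] * length
    for pos, nt in substitutions.items():
        bases[pos - 1] = nt
    return "".join(bases)


print(matching_genotypes(synthetic_core({291: "G", 294: "C"})))
```

A practical scheme would use the full list and would need to
tolerate microheterogeneity at individual positions.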

Finally, although the full length nucleic acid
sequences of the C gene of isolates representing genotypes
I/1a, II/1b, III/2a, IV/2b and V/3a have been reported by
others, those of 9 of the 14 genotypes (i.e., 2c, 4a-4f, 5a
and 6a) have not been reported previously. In sum, by
aligning the consensus sequences of the major genotypes,
the present application enables those skilled in the art to
map universally conserved sequences as well as
genotype-specific sequences of the C gene among 14
genotypes of HCV.

In order to study the heterogeneity of the
deduced C protein, a multiple sequence alignment of the
predicted amino acids for all 52 HCV isolates was
performed, and a consensus sequence was obtained (Fig. 7J).
The identities of the predicted 191 amino acids of the C
protein among these HCV isolates were in the range of
85.3-100.0%. A total of 132 (69.1%) of the 191 amino acids
of the C protein were invariant. The most prevalent amino
acids in the consensus sequence were glycine (13.6%),
arginine (12.6%), proline (11.0%), and leucine (9.9%). The
most conserved amino acids were tryptophan (5 of 5 amino
acids invariant), aspartic acid (5 of 5 amino acids
invariant), proline (19 of 21 amino acids invariant) and
glycine (23 of 26 amino acids invariant). Previous
analyses indicated that HCV is evolutionarily related to
pestiviruses (Miller et al. (1990) Proc. Natl. Acad. Sci.
U.S.A. 87:2057-2061). In this regard, it is of interest to
note that the C proteins of both viruses have a high
content of proline residues (Collett, M.S. et al. (1988)
Virology 165:200-208), which are likely to be important in
maintaining the structure of this protein. As is
characteristic for a protein that binds to nucleic acid,
the C protein has conserved amino acids that are basic and
positively charged, and these are capable of neutralizing
the negative charge of the HCV RNA encapsidated by this
protein (Rice, C.M. et al. (1986) in Togaviridae and
Flaviviridae, eds. Schlesinger, S. & Schlesinger, M.J.
(Plenum Press, New York, N.Y.) pp. 279-326). Specifically,
over 16% of the amino acids in the consensus sequence of
the C protein of HCV are arginine and lysine that are
located primarily in three clusters (i.e., from amino
acids 6-23, 39-74 and 101-121) (Shih, C.M. et al. (1993)
J. Virol. 67:5823-5832) (Fig. 7J). The 10 arginine and
lysine residues within amino acids 39-62 are invariant
among all 52 HCV isolates, suggesting that this domain may
represent an important RNA-binding site. The capsid
proteins of the related flavi- and pestiviruses (Miller et
al. (1990)) also have a high content of arginine and lysine
(Rice et al. (1986); Collett et al. (1988)). Although
there are three major hydrophilic regions (i.e., amino
acids 2-23, 39-74 and 101-121) that are conserved in all 52
HCV isolates, the remainder of the C protein is
hydrophobic. Interestingly, one such highly conserved
hydrophobic domain from aa 24-39 is flanked by proline
residues. The hydrophobic domains are likely to be
involved in protein-protein and/or protein-RNA interactions
during assembly of the nucleocapsid, as well as in
interaction with the lipoprotein envelope, as has been
suggested for flaviviruses (Rice et al. (1986)). Other
significant observations are: (i) a cluster of 5 invariant
tryptophan residues from aa 76-107; (ii) the lack of an
N-linked glycosylation site (N-X-T/S); (iii) two potential
nuclear localization signals (i.e., PRRGPR at amino acids
38-43 and PRGRRQP at amino acids 58-64) that are present in
all 52 HCV isolates (Shih et al. (1993)); and (iv) a
putative DNA-binding motif SPRG at amino acids 99-102,
found in 51 of the 52 HCV isolates, with SP present in all
52 isolates. This study demonstrates that the C protein
has features that are highly conserved among the various
genotypes of HCV, and that are known to be characteristic
of capsid proteins of other related viruses.
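
Sequence features of the kind noted above (basic residue
content, the N-X-T/S glycosylation pattern and short motifs
such as SPRG) can be located in a deduced protein sequence
with a few lines of code. The sketch below is illustrative
only: the 12-residue peptide is a hypothetical toy example
and not a core protein sequence from this study, and the
pattern N-X-T/S is implemented exactly as written in the
text, with X taken to be any residue.

```python
import re


def basic_fraction(protein):
    """Fraction of residues that are arginine (R) or lysine (K)."""
    return sum(protein.count(aa) for aa in "RK") / len(protein)


def find_motif(protein, pattern):
    """Return 1-based start positions of all matches of a regex motif.

    A lookahead is used so that overlapping matches are also reported.
    """
    return [m.start() + 1
            for m in re.finditer(f"(?=({pattern}))", protein)]


N_GLYCOSYLATION = "N.[TS]"  # N-X-T/S, X being any residue (assumption)

# Hypothetical toy peptide (not HCV data).
toy_peptide = "MSPRGANKSRRK"
print(find_motif(toy_peptide, "SPRG"))
print(find_motif(toy_peptide, N_GLYCOSYLATION))
print(round(basic_fraction(toy_peptide), 3))
```

Applied to an alignment of deduced C proteins, the same
functions could be run per isolate to tabulate in how many
isolates each motif is present.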

It should also be noted that the phylogenetic
analysis of the amino acid sequence of the C proteins was
not capable of resolving the minor groups within genotypes
1 and 4 because of the conservation of this protein (data
not shown). Indeed, only a few type-specific amino acids
were identified. One striking example was that isolates of
genotype 4 have an additional methionine at position 20
that is specific for this major genetic group. Finally,
the conservation of the sequences surrounding the cleavage
site between the C and the E1 proteins of the different
genotypes, which has been determined to be between amino
acid 191 (alanine) and aa 192 (tyrosine) in HCV isolates of
genotype 1, was analyzed (Hijikata, M., et al. (1991) Proc.
Natl. Acad. Sci. USA 88:5547-5551). The C-terminal
sequence of C is serine-alanine in all but one of the 48
HCV isolates comprising genotypes 1, 2, 4, 5 and 6.
However, all 4 HCV isolates of genotype 3 in this study, as
well as isolates of genotype 3 published previously
(Okamoto, H., et al. (1993) J. Gen. Virol. 74:2385-2390,
Stuyver, L., et al. (1993) Biochem. Biophys. Res. Comm.
192:635-641), contain alanine-serine at this position.
Thus, studies will be needed to determine the C/E1 cleavage
site in genotype 3 isolates. Overall, the present
application discloses the mapping of universally
conserved sequences, as well as genotype-specific
sequences, of the C protein among 14 genotypes of HCV.

Implications of the mapping of universally
conserved and genotype-specific core nucleotide
and amino acid core sequences for diagnosis of
HCV infection and for determination of HCV
genotypes

Detection of antibodies directed against the HCV
core protein is important in the diagnosis of HCV
infection. The recombinant C22-3 protein, spanning amino
acids 2-120 of the C gene, is a major component of the
commercially available second-generation anti-HCV tests.
Several studies have indicated that the three major
hydrophilic regions of the C protein contain linear
immunogenic epitopes (summarized in Sallberg, M. et al.
(1992) J. Clin. Microbiol. 30:1989-1994). For example,
antibodies against synthetic peptides from amino acids
1-18, 51-68 and 101-118 were detected in infected patients
(Sallberg, M. et al. (1992)). The present application
demonstrates that, while these immunogenic regions are
highly conserved, genotype-specific differences are
observed at several amino acid positions that may influence
the specificity and sensitivity of the serological tests.
One such example is that a single amino acid substitution
at amino acid 110 has been demonstrated to affect
sero-reactivity (Sallberg, et al. (1992)). Despite the high
degree of conservation in the immunodominant regions of the
C protein among the different genotypes, it is possible
that genetic heterogeneity of the C protein could lead to
false negative results in current serological tests.

With respect to genotype analysis, several
methods have been used to determine the genotype of HCV
isolates without resorting to sequence analysis. These
include PCR followed by: (i) amplification with
type-specific primers (Okamoto, H. et al. (1992) J. Gen.
Virol. 73:673-679); (ii) determination of restriction
length polymorphism (Simmonds, P. et al. (1993) J. Gen.
Virol. 74:661-668); and (iii) specific hybridization
(Stuyver, L. et al. (1993) J. Gen. Virol. 74:1093-1102).
The proposed methods have primarily been based on 5' NC and
C sequences. Previous studies suggested that 5' NC-based
genotyping systems would only be predictive of the major
genetic groups of HCV (Bukh, J., et al. (1992) Proc. Natl.
Acad. Sci. USA 89:4942-4946, Bukh, J., et al. (1993) Proc.
Natl. Acad. Sci. USA 90:8234-8238). The most widely used
C-based genotype system has been the PCR assay with
type-specific primers that was designed for distinguishing
HCV isolates of genotypes I/1a, II/1b, III/2a, IV/2b and
V/3a (Okamoto, H., et al. (1993) J. Gen. Virol.
74:2385-2390, Okamoto, H., et al. (1992) J. Gen. Virol.
73:673-679). Since this system was developed prior to the
identification of genotypes 2c, 4a-4f, 5a and 6a, there are
significant limitations to this typing system. For example,
the primers specific for genotype IV/2b (nt 270-251) are as
highly conserved within isolates of genotype 4c and 6a as
within the isolates of genotype IV/2b. Thus, this assay
probably cannot distinguish among these genotypes. Another
C-based approach involves distinguishing between genotypes
1 and 2 by type-specific antibody responses (Machida et al.
(1992) Hepatology 16:886-891). Synthetic peptides
composed of amino acids 65-81 were found to be
genotype-specific for genotypes 1 and 2 in ELISA assays.
The present analysis of amino acid sequences demonstrated
significant variation within isolates of genotypes 1 and 2.
Thus it is likely that these peptides will not identify all
isolates of genotypes 1 and 2. Furthermore, the peptide
for genotype 1 was highly conserved within isolates of
genotypes 3 and 4 and might detect antibodies against these
genotypes as well. Finally, it should be pointed out that
most isolates of genotypes 3 and 4 had an identical amino
acid sequence at positions 65-81.

Example 5

Detection by ELISA Based on Antigen from
Insect Cells Expressing Complete E1 Or Core Protein

Expression of E1 or Core protein in SF9 cells. A
cDNA (e.g., SEQ ID NO:1) encoding a complete E1 protein
(e.g., SEQ ID NO:52) or a cDNA (e.g., SEQ ID NO:103)
encoding a complete core protein (e.g., SEQ ID NO:155) is
subcloned into pBlueBac-Transfer vector (Invitrogen) using
standard subcloning procedures. The resultant recombinant
expression vector is cotransfected into SF9 insect cells
(Invitrogen) by the Ca precipitation method according to
the Invitrogen protocol.

ELISA Based on Infected SF9 cells. 5 x 10** SF9
cells infected with the above-described recombinant
expression vector are resuspended in 1 ml of 10 mM
Tris-HCl, pH 7.5, 0.15M NaCl and are then frozen and thawed
3 times. 10 ul of this suspension is dissolved in 10 ml of
carbonate buffer (pH 9.6) and used to cover one flexible
microtiter assay plate (Falcon). Serum samples are diluted
1:20, 1:400 and 1:8000, or 1:100, 1:1000 and 1:10000.
Blocking and washing solutions for use in the ELISA assay
are PBS containing 10% fetal calf serum and 0.5% gelatin
(blocking solution) and PBS with 0.05% Tween-20 (Sigma,
St. Louis, MO) (washing solution). As a secondary antibody,
peroxidase-conjugated goat IgG fraction to human IgG or
horseradish peroxidase-labelled goat anti-Old or anti-New
World monkey immunoglobulin is used. The results are
determined by measuring the optical density (O.D.) at 405
nm.

To determine if insect cell-derived E1 or core
protein representing genotype I/1a of HCV could detect
anti-HCV antibody in chimpanzees infected with genotype
I/1a of HCV, three infected chimpanzees are examined. The
serum of all 3 chimpanzees is found to seroconvert to
anti-HCV.

Example 6

Use of the Complete
E1 Protein as a Vaccine

Mammals are immunized with purified or partially
purified E1 protein in an amount sufficient to stimulate
the production of protective antibodies. The immunized
mammals challenged with various genotypes of HCV are
protected.

It is understood by one skilled in the art that
the recombinant E1 protein used in the above vaccine can
also be used in combination with other recombinant E1
proteins having an amino acid sequence shown in SEQ ID
NOs:52-102. In addition, recombinant core proteins having
an amino acid sequence shown in SEQ ID NOs:155-206 could
also be used in the above vaccine, either alone, in
combination with other recombinant core proteins of the
present invention, or in combination with recombinant E1
proteins having an amino acid sequence shown in SEQ ID
NOs:52-102.

Example 7

Determination of the Genotype of an HCV
Isolate Via Hybridization of Genotype-Specific
Oligonucleotides to RT-PCR Amplification Products.

Viral RNA is isolated from serum obtained from a
mammal and is subjected to RT-PCR as in Example 1 or
Example 3. Following amplification, the amplified DNA is
purified as described in Example 1 or Example 3 and
aliquots of 100 ul of amplification product are applied to
dots on a nitrocellulose filter set in a dot blot
apparatus. The dots are then cut into separate dots and
each dot is hybridized to a 32P-labelled oligonucleotide
specific for a single genotype of HCV. The
oligonucleotides to be used as hybridization probes are
deduced from the consensus sequences shown in Figures 1A-1H
or 6A-6J or from the SEQ ID NOs: representing E1 or core
sequences comprising genotypes 4a-4f, 2c and 6a.

Example 8

ELISA Based on Synthetic
Peptides Derived From E1 cDNA Sequences

E1 peptide(s) specific for genotype I/1a is
placed in 0.1% PBS buffer and 50 ul of a 1 mg/ml solution of
peptide is used to cover each well of the microtiter assay
plate. Serum samples from two mammals infected with
genotype I/1a HCV and from one mammal infected with
genotype 5a HCV are diluted as in Example 3 and the ELISA
is carried out as in Example 3. Both mammals infected with
genotype I/1a HCV react positively with peptides while the
mammal infected with genotype 5a HCV exhibits no
reactivity. One skilled in the art would readily
understand that in the above experiment, core peptides
specific for genotype I/1a could be used in place of, or in
combination with, the E1 genotype-specific peptide(s).

Example 9

Use of E1 Peptides as a Vaccine

Since the E1 genotype-specific peptides of the
present invention are derived from two variable regions in
the complete E1 protein, there exists support for the use
of these peptides as a vaccine to protect against a variety
of HCV genotypes. Mammals are immunized with peptide(s)
selected from SEQ ID NOs:136-159 in an amount sufficient
to stimulate production of protective antibodies. The
immunized mammals challenged with various genotypes of HCV
are protected. One skilled in the art would readily
understand that genotype-specific core peptides of the
present invention could also be used either alone, in
combination with each other, or in combination with the
genotype-specific E1 peptides, as a vaccine to protect
against a variety of HCV genotypes. In addition, the above
vaccines may also be formulated using the universal core
and/or E1 peptides of the present invention.
(iii) NUMBER OF SEQUENCES: 263

(iv) CORRESPONDENCE ADDRESS:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC 39

AAT GAT TGC CCT AAC TCG AGT ATC GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAC ACT CCG GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC GTC TCG AGG TGT TGG GTG GCG ATG ACC 156

CCC ACG GTG GCC ACC AGG GAT GGC AAA CTC CCC ACA GCG 195

CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG AGT GCC 234

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273

TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312

AGG CGC CAC TGG ACG ACG CAA GGC TGC AAT TGT TCT ATC 351

TAT CCT GGC CAT ATA ACG GGT CAC CGC ATG GCG TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACC ACG GCG TTG GTA GTA 429

GCT CAG CTG CTC CGG ATC CCG CAA GCC ATC TTG GAC ATG 468

ATC GCT GGT GCT CAC TGG GGA GTC CTG GCG GGC ATA GCG 507

TAT TTT TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546

GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: DK9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:




TAC CAA GTA CGC AAC TCC TCG GGC CTC TAC CAT GTC ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAT TCT CCA GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC GCC TCG AAA TGT TGG GTG GCG GTG GCC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAG CTC CCC GCA ACG 195

CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234

ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC TTG TGC GGG 273

TCT GTC TTC CTT GTC GGC CAA CTG TTC ACC TTC TCC CCC 312

AGA CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351

TAC CCC GGC CAT ATT ACG GGT CAT CGC ATG GCG TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACA GCA GCG CTG GTA ATG 429

GCG CAG CTG CTC AGG ATC CCG CAG GCC ATC TTG GAC ATG 468

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC GTG GTG 546

GTA CTG TTG CTG TTT ACC GGC GTC GAT GCG 576
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: DR1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



CAC CAA GTG CGC AAC TCT ACA GGG CTT TAC CAT GTC ACC 39

AAT GAT TGC CCT AAT TCG ACT ATT GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAC GCG CCG GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC GCC TCG AGG TGT TGG GTG GCG GTG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195

CAG CTT CGA CGT CAC ATC GAC CTG CTT GTC GGG AGC GCC 234

ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273

TCT GTC TTC CTT GTC GGT CAA CTG TTC ACC TTT TCT CCC 312

AGG CGC CAC TGG ACA ACG CAA GAC TGC AAT TGT TCT ATG 351

TAT CCC GGC CAT ATA ACG GGA CAC CGT ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA ATG 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468

ATC GCT GGA GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC GTG GTA 546

GTG CTG TTG CTG TTT GCC GGC GTT GAT GCG 576
(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: DR4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
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o 





CAA 


GTG 






id 




cznn 




Tan 


P2VT 


GTP 






.AAT 


GAT 


■TGC 


CCT 


" Ti Tvm 






ATT 


• \a X\7 ' 


TUP 


GAG ■ 


GCG 


:GCC 


. / O 


GAT 


GCC 


Tk rp<^ 

ATC 












X\7X 


GTf 


W w X 


TGC 


GTT 




















TGG 


GTG 


GCG 


GTG 


ACC 




CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195

CAG CTC CGA CGT CAC ATC GAC CTG CTT GTC GGG AGC GCC 234

ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273

TCT GTC TTC CTT GTC GGT CAA CTG TTC ACC TTC TCT CCC 312

AGG CAC CAC TGG ACA ACG CAA GAC TGC AAT TGT TCC ATC 351

TAT CCC GGC CAT ATA ACG GGC CAC CGC ATG GCG TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA GTA 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546

GTG CTG TTG CTG TTT GCC GGC GTT GAT GCG 576
(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: S14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:



TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTT ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACA GCT 78

GAT GCT ATC CTA CAC GCT CCG GGA TGT GTC CCT TGC GTT 117

CGT GAG GGT AAC ACC TCG AGG TGT TGG GTG GCG ATG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195

CAG CTT CGA CGT TAC ATC GAT CTG CTT GTC GGG AGC GCC 234

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273

TCT GTC TTT CTT GTC GGT CAG CTG TTT AGC TTC TCT CCC 312

AGG CGC CTC TGG ACG ACG CAA GAC TGC AAT TGT TCT ATC 351

TAT CCC GGC CAT ATA ACG GGT CAT CGC ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG ACG GCA CTG GTA GTA 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAT ATG 468

ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGA AAC TGG GCG AAG GTC CTA GTG 546

GTG CTG CTG CTA TTC GCC GGC GTT GAC GCG 576
(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: S18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:



TAC CAA GTA CGC AAC TCC ACG GGC CTT TAC CAT GTC ACC 39

AAT GAC TGC CCT AAC TCG AGC ATT GTG TAC GAG ACG GCC 78

GAT ACC ATC CTA CAC TCT CCG GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC GCC TCG AGA TGT TGG GTG CCG GTG GCC 156

CCC ACA GTT GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195

CAG CTT CGA CGT CAC ATC GAT CTG CTT GTT GGG AGC GCC 234

ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC CTG TGC GGG 273

TCT GTC TTT CTT GTC AGC CAG CTG TTC ACT ATC TCC CCC 312

AGG CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351

TAC CCC GGC CAT ATA ACG GGT CAC CGT ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACA ACG GCG TTG GTA ATA 429

GCT CAG CTG CTC AGG GTC CCG CAA GCC GTC TTG GAC ATG 468

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GCG GGG AAC TGG GCG AAG GTC CTG CTA 546

GTG CTG TTG CTG TTT GCC GGC GTC GAT GCG 576
(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TAC CAA GTA CGC AAC TCC TCG GGC CTT TAC CAT GTC ACC 39 

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACG GCC 78 

GAT GCC ATT CTA CAC TCT CCA GGG TGT GTC CCT TGC GTT 117 

CGC GAG GAT GGC GCC CCG AAG TGT TGG GTG GCG GTG GCC 156 

CCC ACA GTC GCC ACT AGG GAC GGC AAA CTC CCT GCA ACG 195

CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGA AGC GCC 234 

ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273 

TCT GTC TTT CTC GTC AGT CAA CTG TTC ACG TTC TCC CCC 312 

AGG CGC CAC TCG ACA ACG CAA GAC TCT AAC TCT TCT ATC 351 

TAT CCC GGC CAC ATA ACG GGT CAC CGC ATC GCA TGG GAT 390 

ATC ATC ATC AAC TCG TCC CCC ACA ACA GCG CTC GTA GTA 429 

GCT CAG CTC CTC AGG ATC CCG CAA GCC GTC TTG GAC ATC 468 

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507 




TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG ATA 546 
GTG CTG TTG CTG TTT TCG GGC GTG GAT GGG 576 



(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:



TAC CAA GTA CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAC ACT CCG GGG TGT GTT CCT TGC GTT 117

CGC GAG GGT AAC GCT TCG AGG TGT TGG GTG GCG ATG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195

CAA CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273

TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312

AGA CGC CAC TGG ACG ACG CAG GGC TGC AAT TGT TCT ATC 351

TAT CCC GGC CAT ATA ACG GGT CAC CGC ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG GCG GCG TTG GTG GTA 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468

ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546

GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576
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(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39 

AAC GAC TGT TCC AAC TCG AGC ATT GTG TAT GAG ACA GCG 78 

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117 

CGG GAG GAC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACC 156 

CCC ACG CTC GCG GCT AGG AAT GGC AAC GTC CCC ACT ACG 195 



GCG ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TGC GGC ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC CTC TCG CCT 312

CGC CGG CAT GAG ACG GTA CAG GAG TGT AAT TGC TCA ATC 351

TAT CCC GGC CAC GTG ACA GGT CAC CGT ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC TTA GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTC GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCT GGC GTT GAC GGC 576
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(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: D3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TAT GAA GTG CGC AAC GTG TGG GGG GTG TAC CAA GTG ACG 39

AAT GAC TGT TCC AAC TCG AGG ATG GTG TAT GAG ACA GGG 78

GAC ATG ATC ATG CAC ACG CGG GGG TGC GTG CCC TGG GTT 117

CGG GAG GAC AAC TCC TCT CGG TGG TGG GTA GCG CTC ACG 156

CCC ACG CTC GCG GGT AGG AAT AGG AGG GTC GCG ACT AGG 195

ACA ATA CGA CGC CAC GTG GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCG ATG TAG GTG GGG GAT GTT TGC GGA 273

TCT GTT TTC CTC GTC TGG CAG GTG TTC ACG TTG TCG GGT 312

CGC CGG CAT GAG ACA GTA CAG GAA TGT AAC TGG TCA ATC 351

TAT CCC GGC CAC GTG ACA GGT GAC GGC ATG GGT TGG GAT 390

ATG ATG ATG AAC TGG TCG GGT ACA GGA GCG GTA GTG GTA 429

TCG CAG TTA CTC CGG ATG CCA CAA GCT GTC GTG GAG ATG 468

GTG GCG GGG GCC GAG TGG GGG GTG GTG GGG GGG CTC GGC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GGT GGG GTG GAC GGC 576
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(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 




(C) INDIVIDUAL ISOLATE: DK1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAC GTC ACA 39 

AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAG GCA GTG 78 

GAC GTG ATC ATG CAT ACC CCA GGG TGC GTG CCC TGC GTT 117

CGG GAG AAC AAC CAC TCC CGT TGC TGG GTA GCG CTC ACC 156 

CCC ACG CTC GCG GCC AGG AAC GCC AGC ATC CCC ACT ACG 195 

ACA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGA 273 

TCC GTT TTC CTC GTC TCT CAG CTG TTC ACC TTT TCA CCT 312 

CGC CGG CAT GAG ACA GCA CAG GAC TGC AAC TGC TCA ATC 351 

TAT CCC GGC CAC GTT TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG CTA 429

TCG CAG TTA CTC CGA ATC CCA CAA GCT GTC GTG GAC ATG 468 

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507 

TAC TAC TCC ATG GCG GGG AAC TGG GCC AAG GTT TTA ATT 546 

GTG TTG CTA CTC TTT GCC GGC GTT GAT GGG 576 
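
Because the invention also concerns the deduced amino acid sequences, it can help to see how a listed nucleotide sequence maps to protein. The following is a minimal Python sketch that translates a listing with the standard genetic code, assuming the first printed codon is in frame (an assumption drawn from the triplet layout of the listing). The names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(nucleotides):
    """Translate a DNA string (whitespace ignored) into one-letter amino acids.

    Codons containing characters other than A, C, G, T are reported as 'X';
    stop codons are reported as '*'.
    """
    dna = "".join(nucleotides.split()).upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna), 3)
    )


# First row of SEQ ID NO: 11 (isolate DK1).
peptide = translate("TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAC GTC ACA")
print(peptide)  # YEVRNVSGVYHVT
```

An unexpected '*' or 'X' in the output of such a translation is a useful hint that a codon was mis-transcribed.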



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



TAT GAA GTG CGC AAC GTG TCC GGG ATA TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC GTC GTG TAT GAG ACA GCA 78

GAC ATG ATC ATG CAT ACC CCT GGA TGC GTG CCC TGC GTA 117

CGG GAG AAC AAC TCC TCC CGC TGT TGG GTA GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC GTC AGC GTC CCC ACC ACG 195

ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234

GCC TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312

CGC CGA CAC GAG ACA GTA CAG GAC TGC AAC TGC TCA CTC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACA GCA GCC CTA GTG GTG 429

TCG CAA TTA CTC CGG ATC CCG CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576
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(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CAT GAA GTG CAC AAC GTA TCC GGG ATC TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTC 117

CGG GAG AAC AAC TCC TCC CGT TGC TGG GTA GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC GCC AGC ATC CCC ACT ACG 195

ACA ATA CGA CGC CAT GTC GAC TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCC ATG TAC GTG GGA GAT CTC TGC GGA 273

TCT GTC TTC CTC GTC TCC CAG TTG TTC ACC TTC TCG CCT 312

CGC CGG CAT GAG ACG GTA CAG GAC TGC AAT TGC TCA ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429

TCG CAG TTA CTC CGA CTC CCA CAA GCT GTC ATG GAC ATG 468

GTG GCG GGA GCC CAC TGG GGA GTC CTA GCG GGC CTT GCT 507

TAC TAT TCC ATG GTG GGG AAC TGG GCC AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576
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(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TAT GAA GTG CGC AAC GTG TCC GGG GTA TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TTA AGC ATC GTG TAC GAG ACA ACG 78

GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117

CGG GAA AAC AAC TCC TCC CGT TGT TGG GTA GCG CTC GCC 156

CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACC ACG 195

GCA ATA CGA CGC CAC GTC GAC TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTT TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCG CCT 312

CGC CGA CAC GAG ACG GTA CAG GAC TGC AAC TGC TCA ATC 351
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TAT CCC 


GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG GTG 429

TCG GAG TTA CTC CGG ATG CCG CAA GCT GTC GTG GAC ATG 468

GTA GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTT GCC 507

TAG TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576
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5 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



TAT GAA GTG CGC AAC GTG TCC GGG ATA TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAA ACA GCG 78

GAC ATG ATT ATG CAT ACC CCT GGA TGC ATG CCC TGC GTT 117

CGG GAG AAC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT 156

CCC ACG CTC GCG GCT AGG AAT GTC AGC GTC CCC ACT ACG 195

ACA ATA CGA CGC CAC GTC GAC TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTT TCG CCT 312

CGC CGA CAC GAG ACG GTA CAG GAC TGC AAC TGC TCA ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCC ACA ACA GCC CTA GTG GTG 429

TCG CAG TTA CTC CGG ATC CCG CAA GCT ATC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGC AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTG TTT GCC GGC GTT GAT GGG 576
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(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: IND5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GGC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACT 156

CCC ACT CTC GCG GCC AGG AAC GCC AGC GTC TCC ACC ACG 195

ACA ATA CGA CAC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTA TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCG 312

CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCC TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429

TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468

GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507

TAG TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576
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(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TAT GAG GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GGC AAC TTC TCT AGT TGC TGG GTA GCG CTC ACT 156

CCC ACT CTC GCG GCT AGG AAC GCC AGC GTC CCC ACC ACG 195

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCA CCG 312

CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA GCG GCC CTA GTG GTA 429

TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468

GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576
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(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 



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(C) STRANDEDNESS.: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATA ATG CAC ACC CCC GGG TGC GTG CCC TGT GTT 117

CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156

CCC ACA CTC GCG GCT AGG AAT TCC AGC GTC CCA ACT ACG 195

GCA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT CTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCT 312

CGC CGG CAT TGG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351

TAT CCT GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCC ACA GCA GCC CTA GTG GTG 429

TCG CAG CTA CTC CGG ATC CCA CAA GCT ATC TTG GAT GTG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTC TTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGA 576
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(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TAT GAA GTG CGC AAC GTA TCC GGG GCG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAC GAG GCA GCG 78

GAC GTG ATC ATG CAT ACC CCC GGG TGT GTA CCC TGC GTT 117

CAG GAG GGT AAC TCC TCC CAA TGC TGG GTG GCG CTC ACC 156

CCC ACG CTC GCG GCC AGG AAC GCT ACC GTC CCC ACC ACG 195

ACA ATA CGA CGT CAT GTC GAT TTG CTC GTT GGG GCG GCT 234

GTT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTG TGC GGA 273

TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC ATC TCG CCC 312

CGT CGG CAT GAG ACA GTA CAG AAC TGC AAT TGC TCA ATC 351

TAT CCC GGA CAC GTG ACA GGT CAT CGC ATG GCC TGG GAT 390

ATG ATG ATG AAC TGG TCG CCT ACA ACA GCC CTA GTG GTA 429

TCG CAG CTA CTC CGG ATC CCA CAA GCT GTC ATG GAT ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507
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TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT ' 546 
GTG ATG GTA CTT TTT GGT GGT GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S45



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



TAT GAA GTG CGC AAC GTG TCC GGG GCG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GTG 78

GAC GTG ATC CTG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117

CGG GAG AAC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195

ACA ATA CGA CGT CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTT GTT TCC CAG CTG TTC ACC TTC TCG CCT 312

CGT CGG CAT GAG ACA GTA CAG GAC TGC AAC TGT TCA ATC 351

TAT CCC GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCT ACA GCA GCC TTA GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576
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(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39 

AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78 

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117 

CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156 
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ccc 


ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCC ATG TAC GTG GGG GAC CTC TGC GGA 273

TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312

CGC CGG TAT GAG ACA GTA CAG GAC TGC AAT TGC TCA ATC 351

TAT CCC GGC CGC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCT CTA GTA GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT ATC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTT ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAT CAT GTC ACG 39

AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78

GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GCC AAC TCC TCC CGC TCt TGG GTA GCG CTC ACT 156

CCC ACG CTA GCA GCC AGG AAC ACC AGC GTC CCC ACT ACG 195

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GTT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTT TCA CCT 312

CGC CGG CAC GAG ACA GTA CAG GAC TGC AAC TGT TCC ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAC 390

ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTG GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468

GTA GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCA 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCT GGC GTT GAC GGG 576
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30 (2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 
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PCr/US95/10398 






(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 



TAC GAA GTG CGC AAC GTG TCC GGG GTG TAC TAT GTC ACG 39

AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78

GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117

CGG GAG AGC AAT TCC TCC CGC TGC TGG GTA GCG CTT ACT 156

CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACT AAG 195

ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234

GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTC TCG CCT 312

CGC CGG CAT GAG ACA GTA CAG GAC TGC AAC TGC TCA ATC 351

TAT CCC GGC CAC GTA ACA GGT CAC CGT ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCC ACA ACG GCA CTA GTG GTG 429

TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG CTG CTA CTC TTT GCC GGC GTT GAT GGG 576
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15 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC ATT GTG TTT GAG GCA GCG 78

GAC TTG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GGC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC ACC AGC GTC CCC ACT ACG 195

ACG ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAT GTG GGA GAC CTC TGC GGA 273

TCT GTT TTC CTC GTC TCT CAG CTG TTC ACC TTC TCG CCT 312

CGC CGG CAT GAG ACT TTG CAG GAC TGC AAC TGC TCA ATC 351

TAT CCC GGC CAT CTG TCA GGT CAC CGC ATG GCT TGG GAC 390

ATG ATG ATG AAC TGG TCG CCT ACA ACA GCT CTA GTG GTG 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468

GTG ACA GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GCG GGG AAC TGG GCT AAG GTT TTA ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAT GGG 576
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;(2) INFORMATION FOR SEQ ID NO: 25; 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78

GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGT GTT 117

CGG GAG AAC AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC GCT AGC GTC CCC ACT ACG 195

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

ACT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGG 273

TCC GTT TTC CTC ATC TCC CAG CTG TTC ACC TTC TCG CCT 312

CGT CAG CAT GAG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCA CCT ACA GCA GCC CTA GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546

GTG TTG CTA CTC TTT GCC GGC GTT GAC GGG 576
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(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GCC CAA GTG AGG AAC ACC AGC CGC GGT TAC ATG GTG ACT 39

AAC GAC TGT TCC AAT GAG AGC ATC ACC TGG CAG CTC CAA 78

GCC GCG GTT CTC CAC GTC CCC GGG TGT ATC CCG TGT GAG 117

AGG CTG GGA AAT ACA TCC CGA TGC TGG ATA CCG GTC ACA 156

CCA AAC GTG GCC GTG CGG CAG CCC GGC GCT CTT ACG CAG 195

GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234

ACG CTC TGC TCT GCC CTC TAC GTG GGG GAC CTC TGC GGC 273

GGG GTG ATG CTC GCA GCC CAG ATG TTC ATT GTC TCG CCG 312
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CGA 


CGG 


GAG 


TGG 


TTT 


GTG 


GAA 


GAA 


TGG 


JUVT 


TGG 


TGG 


ATG 


351 


TAG 


CCC 


GGT 


ACG 


ATG- 


ACT 


GGA 


GAG 


GGT 


:aTG 


:GGA: TGG 


:GAG 


390 


ATG 


ATG 


ATG 


AAG 


TGG 


TGG 


GGG 


AGA 


GGG 


AGG 


ATG 


ATG 


GTG 


429 


GCG 


TAG 


GGG 


ATG 


GGG 


GTT 


GGG 


GAG 


GTG 


ATG 


ATA 


GAG 


ATG 


466 


ATC 


GGC 


GGG 


GGT 


GAG 


TGG 


GGG 


GTG 


ATG 


TTT 


GGG 


TTG 


GCG 


507 


TAG 


TTC 


TGT 


ATG 


GAG 


GGA 


GGG 


TGG 


GGG 


AAG 


GTG 


ATT 


GTG 


546 


ATC 


CTG 


TTG 


GTG 


GGT 


GGT 


GGG 


GTG 


GAG 


GCG 
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(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T4



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GGA CAA GTG AAG AAG ACG ACT AAC AGG TAG ATG GTG ACG 39

AAC GAG TGT TGT AAT GAG AGG ATC ACT TGG CAG CTG CAG 78

GCG GCG GTG CTG GAG GTG CCC GGG TGT GTC GCG TGG GAG 117

AAA ACG GGA AAT AGA TCT CGG TGG TGG ATA GCG GTT TGA 156

CCA AAC GTG GGG GTG CGG CAG CCC GGG GGC CTG AGG CAG 195

GGC TTG GGG ACG CAG ATT GAG ATG GTT GTG ATG TGG GGC 234

AGG CTG TGG TGT GGT GTT TAG GTG GGG GAG GTC TGG GGC 273

GGG GTG ATG CTG GGA GCG GAG ATG TTC ATC GTC TGG CCG 312

CAA CAT CAG TGG TTT GTG CAA GAG TGG AAT TGG TCT ATC 351

TAG GGT GGG ACG ATC ACT GGA GAG GGT ATG GGA TGG GAT 390

ATG ATG ATG AAC TGG TGG CCC ACG GGC ACG ATG ATC GTG 429

GCG TAG GCG ATG GGC GTT GGG GAG GTG ATC TTA GAG ATG 468

GTT AGG GGG GGA GAG TGG GGC GTG ATG TTC GGC TTG GCG 507

TAG TTC TGT ATG CAG GGA GGG TGG GGG AAA GTG GTT GTG 546

ATC GTT CTG CTG GCG GGT GGG GTG GAG GCG 576

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



WO 96/05315 PCT/US95/10398





[illegible] GTG AAG [illegible] AGT ACC AGC TAC ATG GTG ACA 39

[illegible] TGT TCC AAC GAC AGC ATC ACC TGG CAA CTC CAG 78

GCC GCG GTC CTC CAC GTC CCC GGG TGC GTC CCG TGC GAG 117

AGA GTT GGA AAC GOG TCG CGG TGC TGG ATA CCG GTC TCG 156

CCA AAC GTA GCT GTG CAG CGG CCT GGC GCC CTC ACG CAG 195

GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234

ACG CTC TGC TCC GCT CTC TAC GTG GGG GAT CTC TGC GGC 273

GGG GTA ATG CTC GCC GCT CAG ATG TTC ATT ATC TCG CCG 312

CAG CAC CAC TGG TTT GTG CAG GAA TGC AAC TGC TCC ATT 351

TAC CCT GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390

ATG ATG ATG AAC TGG TCG CCC ACA ACC ACC ATG ATC TTG 429

GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468

ATC AGC GGA GCT CAC TGG GGC GTC ATG TTC GGC CTA GCC 507

TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC GTT GTC 546

ATC CTG TTG CTC ACC GCT GGC GTG GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GTC CAA GTG AAA AAC ACC AGT ACC AGC TAT ATG GTG ACC 39

AAT GAC TGC TCC AAC GAC AGC ATC ACT TGG CAA CTT GAG 78

GCT GCG GTC CTC CAC GTT CCC GGG TGT GTC CCG TGC GAG 117

AAA GTG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTC TCA 156

CCA AAT GTG GCC GTG CAG CGG CCT GGC GCC CTC ACG CAG 195

GGC TTG CGG ACT CAC ATC GAC ATG GTC GTG ATG TCC GCC 234

ACG CTC TGC TCC GCT CTT TAC GTG GGG GAC TTC TGC GGT 273

GGG ATG ATG CTC GCA GCC CAA ATG TTC ATT GTC TCG CCG 312

CGC CAC CAC TCG TTT GTG CAG GAA TGC AAC TGC TCC ATC 351

TAC CCC GGT ACC ATC ACC GGG CAC CGT ATG GCA TGG GAC 390

ATG ATG ATG AAC TGG TCG CCC ACG GCC ACT TTG ATC CTG 429

GCG TAC GTG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468

ATT AGC GGG GCG CAT TGG GGC GTC TTG TTC GGC TTA GCC 507

TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546

ATC CTT CTG CTA GCC GCT GGG GTG GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: DK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

GTG GAA GTC AGG AAC ATC AST TCC AGC TAC TAC GCC ACC 39 

AAT GAT TCC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78 

GAC GCA GTT CTC CAC CTT CCC GGA TCC GTC CCA TCT GAG 117 

AAT GAC AAT GGC ACC CTC CGC TCC TGG ATA CAA GTC ACA 156 

CCT AAT GTC GCT GTC AAA CAC CGC GGC GCA CTT ACT CAT 195 

AAC CTC CGA ACA CAC GTC GAC GTC ATC GTA ATC GCA GCT 234 

ACG GTC TCC TCG GCC TTC TAT GTC GGA GAC GTA TCC GGG 273 

GCC GTC ATC ATC GTC TCG CAG GCT CTC ATA ATA TCG CCT 312

GAA CGC CAC AAC TTT ACC CAG GAG TCC AAC TCT TCC ATC 351 

TAC CAA GGT CAT ATC ACC GGC CAC CGC ATC GCA TGG GAC 390 

ATC ATC CTA AAC TGG TCA CCA ACT CTT ACC ATC ATC CTC 429 

GCC TAT GCC GCT CGT GTT CCT GAG CTA GCC CTC CAG GTT 468

GTC TTC GGC GGC CAT TGG GGC GTC GTC TTT GGC TTC GCC 507 

TAT TTC TCC ATC CAG GGA GCG TGG GCC AAA GTC ATT GCC 546 

ATC CTC CTT CTT GTC GCA GGA GTC GAT GCA 576 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

GTC GAA GTC AGG AAC ACC AGT TCT AGT TAC TAC GCC ACC 39 

AAT GAT TCC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78 

AAC GCA GTT CTC CAC CTT CCC GGA TCC GTC CCA TCT GAG 117 

AAT GAC AAT GGC ACC CTG CAC TCC TGG ATA CAA GTC ACA 156 

CCT AAT GTC GCT GTC AAA CAC CGC GGC GCA CTC ACT CAC 195

AAC CTG CGA GCA CAT ATA GAT ATC ATT GTA ATC GCA GCT 234 

ACG GTC TCC TCG GCC TTC TAT GTC GGA GAC GTG TCC GGG 273 

GCC GTG ATC ATC GTC TCG CAG GCT TTC ATA GTA TCG CCA 312 

GAA CAC CAC CAC TTT ACC CAA GAG TCC AAC TCT TCC ATC 351 

TAC CAA GGT CAC ATC ACC GGC CAC CGC ATC GCA TGG GAC 390 

ATC ATC CTT AAC TGG TCA CCA ACT CTC ACC ATC ATC CTC 429 

GCC TAT GCC GCC CGT GTT CCT GAG CTA GTC CTT GAA GTC 468 

GTC TTC GGT GGT CAT TGG GGT GTC GTC TTT GGC TTC GCC 507 






TAT TTC TCC ATG CAG GGA GCG TGG GCC AAG GTC ATT GCC 546

ATC CTC CTT CTT GTA GCA GGA GTG GAT GCA 576



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

GTG GAA GTC AGG AAC ATC AGT TCT AGC TAC TAT GCC ACC 39

AAT GAT TGC TCA AAC AGC AGC ATC ACC TGG CAA CTC ACC 78

AAC GCA GTC CTC CAC CTT CCC GGA TGC GTC CCG TGT GAG 117

AAT GAT AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156

CCT AAT GTG GCT GTG AAA CAC CGC GGC GCG CTC ACT CAC 195

AAC CTG CGA GCA CAC GTC GAT ATG ATC GTA ATG GCA GCT 234

ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC ATG TGC GGG 273

GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA ATA TCG CCA 312

GAA CGC CAC AAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351

TAC CAA GGT CGT ATC ACC GGC CAC CGC ATG GCG TGG GAC 390

ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTT 429

GCC TAT GCC GCT CGT GTT CCT GAG CTA GTC CTT GAA GTT 468

GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507

TAT TTC TCC ATG CAA GGA GCG TGG GCC AAG GTC ATT GCC 546

ATC CTC CTG CTT GTC GCA GGA GTG GAT GCA 576

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

GTG GAA GTT AGA AAC ACC AGT TTT AGC TAC TAC GCC ACC 39

AAT GAT TCC TCG AAC AAC AGC ATC ACC TGG CAG CTC ACC 78

AAC GCA GTT CTC CAC CTT CCC GGA TCC GTC CCA TCT GAG 117

AAT GAC AAT GGC ACC TTC CGC TCC TGG ATA CAA GTA ACA 156

CCT AAT GTG GCT GTG AAA CAC CGT GGC GCA CTC ACT CAC 195

AAC CTG CGA ACG CAT GTC GAC GTG ATC GTA ATG GCA GCT 234

ACG GTC TGC TCG GCC TTG TAT GTG GGG GAC GTG TGC GGG 273

GCC GTG ATG ATA GCG TCG CAG GCT TTC ATA ATA TCG CCA 312

GAA CGC CAC AAC TTC ACC CAG GAG TGC AAC TGT TCC ATC 351

TAG CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390

ATG ATG CTG AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429

GCC TAC GCT GCT CGT GTG CCT GAA CTA GTC CTT GAA GTT 468

GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507

TAT TTC TCC ATG CAA GGA GCG TGG GCC AAA GTC ATC GCC 546

ATC CTC CTC CTT GTC GCA GGA GTG GAC GCA 576

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S83 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GTG GAG GTC AAG GAC ACC GGC GAC TCC TAC ATG CCG ACC 39

AAC GAT TGC TCC AAC TCT AGT ATC GTT TGG CAG CTT GAA 78

GGA GCA GTG CTT CAT ACT CCT GGA TGC GTC CCT TGT GAG 117

CGT ACC GCC AAC GTG TCT CGA TGT TGG GTG CCG GTT GCO 156

CCC AAT CTC GCC ATA AGT CAA CCT GGC GCT CTC ACT AAG 195

GGC CTG CGA GCA CAC ATC GAT ATC ATC GTG ATG TCT GCT 234

ACG GTC TGT TCT GCC CTT TAT GTG GGG GAC GTG TGT GGC 273

GCG CTG ATG CTG GCC GCT CAG GTC GTC GTC GTG TCG CCA 312

CAA CAC CAT ACG TTT GTC CAG GAA TGC AAC TGT TCC ATA 351

TAC CCG GGC CGC ATT ACG GGA CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCC ACT ACC ACC ATG CTC CTG 429

GCG TAC TTG GTG CGC ATC CCG GAA GTC ATC TTG GAT ATT 468

GTT ACA GGA GGT CAT TGG GGT GTA ATG TTT GGC CTC GCT 507

TAC TTC TCC ATG CAG GGA TCG TGG GCG AAG GTC ATC GTT 546

ATC CTC CTG CTG ACT GCT GGG GTG GAG GCG 576

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 



(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK12



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



TTA GAG TGG CGG AAT GTG TCC GGC CTC TAC GTC CTT ACC 39

AAC GAC TGT TCC AAT AGC AGT ATC GTG TAT GAG GCC GAT 78

GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117

CAG GAC GGC AAT ACA TCT ACG TGC TGG ACC TCA GTG ACG 156

CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195

TCG ATA CGC AGT CAT GTG GAC CTG CTA GTG GGC GCG GCC 234

ACG ATG TGC TCT GCG CTC TAC GTG GGT GAT GTG TGT GGG 273

GCC GTC TTC CTT GTG GGA CAA GCC TTC ACG TTC AGA CCT 312

CGT CGC CAT CAA ACA GTC CAG ACC TGT AAC TGC TCG CTG 351

TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTA 429

GCG CAC GTC CTG CGT CTG CCC CAG ACC TTG TTC GAC ATA 468

ATA GCT GGG GCC CAT TGG GGC ATC ATG GCG GGC CTA GCC 507

TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546

ATC ATG GTT ATG TTT TCA GGA GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



CTA GAG TGG CGG AAT GTG TCT GGC CTC TAT GTC CTT ACC 39

AAC GAC TGT CCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78

GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117

CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC TCG GTG ACA 156

CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCC 195

TCG ATA CGC AGT CAT GTG GAC CTG TTA GTG GGC GCG GCC 234

ACG ATG TGC TCT GCG CTC TAC GTG GGC GAT ATG TGT GGG 273

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCG 312

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351

TAC CCA GGC CAC CTT TCA GGA CAT CGA ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCC GTG GGT ATG GTG GTG 429

GCG CAC GTC CTG CGG TTG CCC CAG ACC TTG TTC GAC ATA 468

ATA GCC GGG GCC CAT TGG GGC ATC TTG GCA GGC CTA GCC 507

TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546

ATC ATG GTT ATG TTT TCA GGG GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S2 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTC ACC 39

AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78

GAC GTT ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117

CAG GAC GGT AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156

CCT ACA GTG GCA GTC AGG TAT GTC GGA GCA ACC ACC GCT 195

TCG ATA CGC AGT CAT GTG GAC CTA TTG GTG GGC GCG GCC 234

ACT ATG TGC TCT GCG CTC TAC GTG GGT GAT ATG TGT GGG 273

GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351

TAC CCA GGC CAT CTT TCA GGA CAT CGC ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429

GCG CAC GTT CTG CGT TTG CCC CAG ACC GTG TTC GAC ATA 468

ATA GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507

TAT TAC TCC ATG CAA GGC AAC TGG GCC AAG GTC GCT ATC 546

ATC ATG GTT ATG TTT TCA GGG GTC GAC GCC 576

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S52



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTT ACC 39 

AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78 

GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117 

CAG GAC GGC AAT ACA TCC ATG TGC TGG ACC CCA GTG ACA 156 

CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195 

TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234 

ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273 

GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351

TAG CCA GGC CAT GTT TCA GGA CAT CGA ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429

GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468

CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507

TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATT 546

GTC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S54 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT ATC CTT ACC 39

AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78

GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117

CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156

CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195

TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234

ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273

GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC ASA CCT 312

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351

TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429

GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468

CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507

TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546

ATC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z4 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



GAG GAG TAG CGG AAT GGT TCG GGG ATG TAT CAC ATG ACC 39

AAT GAT TGT CGG AAT TCG AGT ATA GTG TAT GAA GGT GAC 78

CAT GAG ATG GTA CAC TTG CGG GGG TGG GTA CGG TGT GTG 117

ATG ACT GGG AAC ACA TCG GGT TGG TGG AGG CGG GTG AGG 156

CCT ACA GTG GGT GTG GGA CAC CGG GGG GGT GGG GTT GAG 195

TCG TTG GGG GGA CAT GTG GAC TTA ATG GTA GGC GCG GCC 234

ACT TTG TGT TGT GCC GTG TAT GTT GGG GAC GTG TGG GGA 273

GGT GGG TTG GTG ATG GGG CAG ATG ATG AGT TTT GGG CGG 312

CX5T GGG GAG TGG ACC AGG GAG GAG TGG AAT TGT TCG ATG 351

TAG ACT GGG GAT ATG ACC GGG CAC AGG ATG GGG TGG GAC 390

ATG ATG ATG AAC TGG AGG CCT ACC ACC ACT GTG GTG CTC 429

GCC GAG ATG ATG AGG GTG CGG ACA GCC TTT GTG GAC ATS 468

GTT GGG GGA GGG CAC TGG GGG GTG GTG GGG GGG TTG GCG 507

TAG TTG AGG ATG GAA GGG AAT TGG GCC AAG GTA GTG CTG 546

GTG GTT TTG GTG TTT GGT GGG GTA GAG GCC 576


(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GTG GAG TAG GGG AAT GGT TCG GGG GTG TAT CAT GTG ACC 39

AAT GAT TGG CCT AAC AGG AGG ATA GTG TAG GAG AGG GAG 78

GAG GAG ATG ATG CAC TTG CCA GGG TGT GTG GCC TGT GTG 117

CGG AGG GAG AAT ACT TGT CGG TGG TGG GTG GGG TTG ACC 156

GCC ACT GTG GGG GGG GCC TAT GCC AAC GCA CGG TTA GAG 195

TCG ATG CGG AGG CAT GTA GAG GTG ATG GTG GGT GGG GGT 234

ACT ATG TGT TGG GCC TTG TAG ATT GGA GAT GTG TGT GGA 273

GGG GTG TTG GTA GTG GGG CAG GTG TTG GAC TTG GGA CGG 312

CGG GGG GAG TGG ACC ACC CAG GAT TGC AAG TGC TCG ATG 351

TAT GGT GGT GAG GTG TGG GGG CAC AGG ATG GCC TGG GAC 390

ATG ATG ATG AAC TGG AGG GGT AGG AGG GGG GTG ATT ATG 429

GGT GAG ATG TTA CGG ATG CGG TGT ATG GTA GGT GAC TTG 468

GTG AGG GGG GGT CAC TGG GGA GTT GTT GGT GGT CTA GGT 507

TTG TTG AGG ATG GAG AGT AAC TGG GGG AAG GTG ATG CTG 546

GTG CTA TTG GTG TTT GCC GGG GTG GAG GGA 576



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 



(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 



GTT AAC TAT CGC AAT GCC TCG GGC GTC TAT CAC GTC ACC 39

AAC GAC TGC CCG AAC TCG AGC ATA GTG TAT GAG GCC GAA 78

CAC CAG ATC TTA CAC CTC CCA GGG TGC TTG CCC TGT GTG 117

AGG GTT GGG AAT CAG TCA CGC TGC TGG GTG GCC CTT ACT 156

CCC ACC GTG GCG GTG TCT TAT ATC GGT GCT CCG CTT GAC 195

TCC CTC CGG AGA CAT GTG GAC CTG ATG GTG GGC GCC GCT 234

ACT GTA TGC TCT GCC CTC TAC GTT GGA GAT CTG TGC GGT 273

GGT GCA TTC TTG GTT GGC CAG ATG TTC TCC TTC CAG CCG 312

CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCT ATC 351

TAC GCA GGG CAT ATC ACG GGC CAC AGG ATG GCA TGG GAC 390

ATG ATG ATG AAC TGG AGT CCC ACA ACC ACC CTG CTT CTC 429

GCC CAG GTC ATG AGG ATC CCT AGC ACT CTG GTA GAT CTA 468

CTC GCT GGA GGG CAC TGG GGC GTC CTT GTT GGG TTG GCG 507

TAC TTC AGT ATG CAA GCT AAT TGG GCC AAA GTC ATC CTG 546

GTC CTT TTC CTC TTC GCT GGA GTT GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 43: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



GTC AAC TAT CAC AAT GCC TCG GGC GTC TAT CAC ATC ACC 39

AAC GAC TGC CCG AAC TCG AGC ATA ATG TAT GAG GCC GAA 78

CAC CAC ATC CTA CAC CTC CCA GGG TGC GTA CCC TGT GTG 117

AGG GAG GGG AAC CAG TCA CGC TGC TGG GTG GCC CTT ACT 156

CCC ACC GTG GCG GCG CCT TAT ATC GGT GCA CCG CTT GAA 195

TCC ATC CGG AGA CAT GTG GAC CTG ATG GTA GGC GCT GCT 234

ACA GTG TGC TCC GCT CTC TAC ATT GGG GAC CTG TGC GGT 273

GGC GTA TTT TTG GTT GGT CAG ATG TTT TCT TTC CAG CCG 312

CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCC ATC 351

TAT GCG GGG CAC GTT ACA GGC CAC AGA ATG GCA TGG GAC 390

ATG ATG ATG TkAC TGG AGT CCC ACA ACC ACC TTG GTC CTC 429

GCC CAG GTT ATG AGG ATC CCT AGC ACT CTG GTG GAC CTA 468

CTC ACT GGA GGG CAC TGQ GGT ATC CTT ATC GGG GTG GCA 507

TAG TTC TGC ATG CAA GCT AAT TGG GCC AAG GTC ATT CTG 546

GTC CTT TTC CTC TAC GCT GGA GTT GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK13 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



TAC AAC TAT CGC AAC AGC TCG GGT GTC TAC CAT GTC ACC 39

AAC GAT TGC CCG AAC TCG AGC ATA GTC TAT GAA ACC GAT 78

TAC CAC ATC TTA CAC CTC CCG GGA TGC GTT CCT TGC GTG 117

AGG GAA GGG AAC AAG TCT ACA TGC TGG GTG TCT CTC ACC 156

CCC ACC GTG GCT GCG CAA CAT CTG AAT GCT CCG CTT GAG 195

TCT TTG AGA CGT CAC GTG GAT CTG ATG GTG GGC GGC GCC 234

ACT CTC TGC TCC GCC CTC TAC ATC GGA GAC GTG TGT GGG 273

GGT GTG TTC TTG GTC GGT CAA CTG TTC ACC TTC CAA CCT 312

CGC CGC CAC TGG ACC ACC CAA GAC TGC AAT TGT TCC ATC 351

TAC ACA GGA CAT ATC ACA GGA CAC AGA ATG GCT TGG GAC 390

ATG ATG ATG AAT TGG AGC CCC ACT GCG ACG CTG GTC CTC 429

GCC CAA CTT ATG AGG ATC CCA GGC GCC ATG GTC GAC CTis' 468

CTT GCA GGC GGC CAC TGG GGC ATT CTG GTT GGC ATA GCG 507

TAC TTC AGC ATG CAA GCT AAT TGG GCC AAG GTT ATC CTG 546

GTC CTG TTT CTC TTT GCT GGA GTC GAC GCT 576

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GTT CCC TAC CGG AAT GCC TCT GGG GTT TAC CAT GTC ACC 39 

AAT GAC TGC CCA AAC TCC TCC ATA GTC TAC GAG GCT GAT 78 

AGC CTG ATC TTG CAC GCA CCT GGC TGC GTG CCC TGT GTC 117

AGG CAA GAT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA CTG TCA GCC CCG ACC TTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGA GCT 234

GCT CTC TGC TCC GCA CTA TAC GTC GGC GAC GCG TGC GGG 273

GCA GTG TTT CTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAT ACC ACA GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCT TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG CTG ATG 429

GCC CAG ATG CTA CGG ATC CCC CAG GTG GTC ATA GAC ATC 468

ATA GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507

TAC TTT GCG TCG GCC GCC AAC TGG GCT AAG GTA GTG CTG 546

GTT CTG TTC CTG TTT GCG GGG GTC GAT GGC 576

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:



GTT CCC TAC CGA AAC GCC TCT GGG GTT TAT CAT GTC ACC 39
AAT GAT TGC CCA AAC TCT TCC ATA GTT TAC GAG GCT GAT 78
AAC CTG ATC TTG CAT GCA CCT GGT TGC GTG CCT TGT GTC 117
AGG CAA GAT AAT GTC AGT AAG TGC TGG GTC CAA ATC ACC 156
CCC ACG TTG TCA GCC CCG AAT CTC GGA GCG GTC ACG GCT 195
CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGG GCT 234
GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG TGC GGG 273
GCA GTG TTT TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312
CGC CAG CAC ACT ACG GTG CAA GAC TGC AAT TGC TCT ATT 351
TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCA TGG GAC 390
ATG ATG ATG AAT TGG TCA CCT ACG ACG GCC TTG CTG ATG 429
GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468
ATT GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507
TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT ATA CTG 546
GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:



GTC CCC TAC CGA AAT GCC TCT GGG GTT TAT CAT GTC ACC 39
AAT GAT TGC CCA AAC TCT TCC ATA GTC TAC GAG GCT GAT 78
AAC CTG ATT CTG CAC GCA CCT GGT TGC GTG CCC TGT GTC 117
AAG GAA GGT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156
CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC ACG GCT 195
CCT CTT CGG AGG GTC GTT GAC TAC TTA GCG GGA GGG GCT 234
GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG TGC GGG 273
GCA GTG TTC TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312
CGC CAG CAT ACT ACG GTG CAG GAC TGC AAC TGT TCC ATT 351
TAC AGC GGC CAT ATC ACC GGC CAC CGA ATG GCA TGG GAC 390
ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG GTG ATG 429
GCC CAG GTG CTA CGG ATT CCC CAA GTG GTC ATT GAC ATC 468
ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GTC GCA 507
TAC TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT GTG CTG 546
GTC CTG TTT CTG TTT GCG GGG GTC GAT GGC 576



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 



GTT CCT TAC CGG AAT GCC TCT GGG GTG TAT CAT GTT ACC 39
AAT GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCT GAT 78
GAC CTG ATC CTA CAC GCA CCT GGC TGC GTG CCC TGT GTC 117
CGG AAG GAT AAT GTC AGT AGA TGC TGG GTT CAT ATC ACC 156
CCC ACA CTA TCA GCC CCG AGC CTC GGA GCG GTC ACG GCT 195
CCT CTT CGG AGG GCC GTT GAT TAC TTG GCG GGA GGG GCC 234
GCC CTG TGC TCC GCG TTA TAC GTC GGA GAC GTG TGC GGG 273
GCA TTG TTT TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312
CGC CAG CAT GCT ACG GTA CAG GAC TGC AAC TGC TCC ATT 351
TAC AGT GGC CAT ATC ACT GGC CAC CGG ATG GCA TGG GAC 390
ATG ATG ATG AAT TGG TCA CCC GCG ACA GCC TTG GTG ATG 429
GCC CAA ATG CTA CGG ATT CCC CAG GTG GTC ATT GAC ATC 468
ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCT GCA 507
TAC TTC GCG TCG GCG GCT AAC TGG GCT AAG GTT GTG CTG 546
GTC TTG TTT CTG TTT GCG GGG GTT GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:



GTC CCC TAC CGA AAT GCC TCC GGG GTT TAT CAT GTC ACC 39
AAT GAT TGC CCG AAC TCT TCC ATA GTC TAT GAG GCT GAC 78
AAC CTG ATC CTG CAC GCA CCT GGT TGC GTG CCC TGT GTC 117
AGA CAA AAT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156
CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC ACG GCT 195
CCT CTT CGG AGG GCC GTT GAC TAC CTA GCG GGA GGG GCT 234
GCC CTC TGC TCC GCG CTA TAC GTC GGG GAG GCG TGC GGG 273
GCA GTG TTT TTG GTA GGC CAG ATG TTC AGC TAT AGG CCT 312
CGC CAG CAC ACT ACG GTG CAG GAC TGC AAC TGT TGC ATT 351
TAC AGT GGC CAT ATC ACC GGC CAC CGA ATG GCA TGG GAC 390
ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG GTG ATG 429
GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468
ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCC GCA 507
TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT GTG CTG 546
GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 50:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA13 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GTT CCC TAC CGA AAT GCC TCT GGG GTT TAT CAT GTC ACC 39
AAT GAT TGC CCA AAC TCT TCC ATC GTC TAC GAG GCT GAT 78
GAC CTG ATC TTA CAC GCA CCT GGT TGC GTG CCC TGT GTT 117
AGG CAG GGT AAT GTC AGT AGG TGC TGG GTC CAG ATC ACC 156
CCC ACA CTG TCA GCC CCG AGC CTC GGA GCG GTC ACG GCT 195
CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGG GGG GCT 234
GCC CTT TGC TCC GCG TTA TAC GTC GGA GAC GCG TGC GGG 273
GCA GTG TTT TTG GTA GGT CAA ATG TTC ACC TAT AGC CCT 312
CGC CGG CAT AAT GTT GTG CAG GAC TGC AAC TGT TCC ATT 351
TAC AGT GGC CAC ATC ACC GGC CAC CGG ATG GCA TGG GAC 390
ATG ATG ATG AAT TGG TCA CCT ACA ACA GCT TTG GTG ATG 429
GCC CAG TTG TTA CGG ATT CCC CAG GTG GTC ATT GAC ATC 468
ATT GCC GGG GCC CAC TGG GGG GTC TTG TTC GCC GCC GCA 507
TAC TAC GCG TCG GCG GCT AAC TGG GCC AAG GTT GTG CTG 546
GTC CTG TTT CTG TTT GCG GGG GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 51:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 



CTT ACC TAC GGC AAC TCC AGT GGG CTA TAC CAT CTC ACA 39
AAT GAT TGC CCC AAC TCC AGC ATC GTG CTG GAG GCG GAT 78
GCT ATG ATC TTG CAT TTG CCT GGA TGC TTG CCT TGT GTG 117
AGG GTC GAT GAT CGG TCC ACC TGT TGG CAT GCT GTG ACC 156
CCC ACC CTG GCC ATA CCA AAT GCT TCC ACG CCC GCA ACG 195
GGA TTC CGC AGG CAT GTG GAT CTT CTT GCG GGC GCC GCA 234
GTG GTT TGC TCA TCC CTG TAC ATC GGG GAC CTG TGT GGC 273
TCT CTC TTT TTG GCG GGA CAA CTA TTC ACC TTT CAG CCC 312
CGC CGT CAT TGG ACT GTG CAA GAC TGC AAC TGC TCC ATC 351
TAT ACA GGC CAC GTC ACC GGC CAC AGG ATG GCT TGG GAC 390
ATG ATG ATG AAC TGG TCA CCC ACA ACC ACT CTG GTC CTA 429
TCT AGC ATC TTG AGG GTA CCT GAG ATT TGT GCG AGT GTG 468
ATA TTT GGT GGC CAT TGG GGG ATA CTA CTA GCC GTT GCC 507
TAC TTT GGC ATG GCT GGC AAC TGG CTA AAA GTT CTG GCT 546
GTT CTG TTC CTA TTT GCA GGG GTT GAA GCA 576

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: DK7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Val Ser
35 40 45

[illegible] Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Thr Ala Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190





(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:




Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
35 40 45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75





Leu 


Tyr 


Val 


Gly Asp Leu Cys 




85 




90 


Leu 


Phe 


Thr 


Phe Ser Pro Arg 




100 




105 


Asn 


Cys 


Ser 


lie Tyr Pro Gly 




115 




120 


Trp 


Asp 


Met 


Met Met Asn Trp 


130 




135 


Ala 


Gin 


Leu 


Leu Arg lie Pro 




145 




150 


Gly 


Ala 


His 


Trp Gly Val Leu 


160 




165 


Val 


Gly 


Asn 


Trp Ala Lys Val 




175 




180 


Gly 


Val 


Asp 


Ala 




190 







80 
Leu 

95 
Thr 
110 
His 
125 
Ala 
140 
Asp 
155 
Tyr 
170 
Leu 
185 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DR1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:



His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
35 40 45

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Val Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DR4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:



His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser
35 40 45

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

His His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:



Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser
35 40 45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg Tyr Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg Leu Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:





Tyr Gln Val Arg Asn Ser Thr [illegible] Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Thr Ile Leu
20 25 30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
35 40 45

Arg Cys Trp Val Pro Val Ala Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Ile Ala Gln Leu Leu Arg Val Pro
140 145 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Ala Gly Asn Trp Ala Lys Val
170 175 180

Leu Leu Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190





(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:


Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu
20 25 30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Asp Gly Ala Pro
35 40 45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Ile Val Leu Leu Leu Phe Ser Gly Val Asp Ala
185 190









(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:




val 




Asn 


Ser 


Thr 


Gly 


Leu 


Tyr 


His 


Val^ Thr 


Asn Asp 




5 










10 






15 


Asn 


Ser 


Ser 


He 


Val 


Tyr 


Glu 


Ala 


Ala 


Asp Ala 


He Leu 






20 








25 






30 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val 


Arg 


Glu 


Gly Asn 


Ala Ser 




35 










40 






45 


Trp 


val 


Ala 


Met 


Thr 


Pro 


Thr 


Val 


Ala 


Thr Arg 


Asp Gly 




50 










55 






60 


Pro 


Thr 


Thr 


Gin 


Leu 


Arg 


Arg 


His 


He 


Asp Leu 


Leu Val - 






65 










70 






75 


Ala 


Thr 


Leu 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly Asp 


Leu Cys 






80 








85 






90 


Val 


Phe 


Leu 


Val 


Gly 


Gin 


Leu 


Phe 


Thr 


Phe Ser 


Pro Arg 




95 








100 






105 


Trp 


Thr 


Thr 


Gin 


Gly 


Cys 


Asn 


Cys 


Ser 


He Tyr 


Pro Gly 




110 










115 






120 


Thr 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met Met 


Asn Trp 




125 








130 






135 


Thr 


Ala 


Ala 


Leu 


Val 


Val 


Ala 


Gin 


Leu 


Leu Arg 


He Pro 




140 










145 






150 


He 


Leu 


Asp 


Met 


He 


Ala 


Gly 


Ala 


His 


Trp Gly 


Val Leu 






155 










160 






165 



Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asp Asn Ser Ser
35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Gly
50 55 60

Asn Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Leu Ser Pro Arg
95 100 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
185 190







(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:





Val 




Asn 


Val 


Ser 


Gly 


Val 


Tyr 


Gin 


Val 


Thr Asn Asp 




5 










10 






15 




/van 




Ser 

0 WX 


lie 


Val 


Tvr 


Glu 


Thr 


Ala 


Asp 


Met He Met 






20 










25 






30 




f ^ w 


Glv 


Cys 


Val 


Pro 


Cvs 


Val 


Arq 


Glu 


Asp 


Asn Ser Ser 






35 










40 






45 






Val 


Ala 


Leu 


Thr 


Pro 


Thr 


Leu 


Ala 


Ala 


Arg Asn Ser 




50 










55 






60 


Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
185 190





(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Val Asp Val Ile Met
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn His Ser
35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
50 55 60

Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Glu Thr Ala Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Leu Ser Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val
170 175 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
185 190



(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:


Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Val Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190





(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:



His Glu Val His Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Leu Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Leu Ser Ile Val Tyr Glu Thr Thr Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Ala Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190







(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Met Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: IND5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Ser Thr Thr Thr Ile Arg His His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190







(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Phe Ser
                 35                  40                  45

Ser Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                 50                  55                  60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Leu Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Trp Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Val Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190

(2) INFORMATION FOR SEQ ID NO: 70:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

( D ) TOPOLOGY : unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S9 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 



Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Glu Gly Asn Ser Ser
                 35                  40                  45

Gln Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Thr Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Val Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Ile Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn     Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(xi)

Tyr Glu Val Arg Asn Val 

Cys Ser Asn Ser Ser lie V 

20 

His Thr Pro Gly Cys Val P 

35 

Arg Cys Trp Val Ala Leu T 

50 

Ser Val Pro Thr Thr Thr I 

65 

Gly Ala Ala Ala Phe Cys S 

80 

Gly Ser Val Phe Leu Val S 

95 

Arg His Glu Thr Val Gin A 

110 

His Val Thr Gly His Arg ti 

125 

Ser Pro Thr Ala Ala Leu V 

140 

Gin Ala Val Val Asp Met V 

155 

Ala Gly Leu Ala Tyr Tyr S 

170 

Leu lie Val Met Leu Leu E 

,\ 185 

(2) INFORMATION FOR SEQ 
(i) 





His 


Val Thr* Asn AsD 


T n 




15 




Val 
yaix 


Ik cm Val Tie Leu 






30 






zicsn Asn Ser Ser 

AOU nOAJl 


H\J 










Jil a At'CT Asn Ser 

AdbO *vLy noA^ 








His 


Val 
vox 


Asn Leu Leu Va 1 


70 




75 


Tyr 


Val 


Gly Asp Leu Cys 


85 




90 


Phe 


Thr 


Phe Ser Pro Arg 


100 




105 


Cys 


Ser 


lie Tyr Pro Gly 


115 




120 


Asp 


Met 


Met Met Asn Trp 


130 




135 


Gin 


Leu 


Leu Arg lie Pro 


145 




150 


Ala 


His 


Trp Gly Val Leu 


160 




165 


Gly 


Asn 


Trp Ala Lys Val 


175 




180 


Val 


Asp 


Gly 


190 







SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA10

SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp
5 10 15

Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met

20 25 30

Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser

50 55 60



(C) INDIVIDUAL ISOLATE: S45
SEQUENCE DESCRIPTION: SEQ ID NO: 71:
: Gly Ala 
L Tyr Glu 
) Cys val 
: Pro Thr 
i Arg Arg 
r Ala Met 
r Gin Leu 
3 Cys Asn 
: Ala Trp 
L Val Ser 
L Ala Gly 
r Met Val 
e Ala Gly 

D NO: 72: 



(vi) 

(xi) 
Tyr Glu Val
Cys Ser Asn
His Thr Pro
Arg Cys Trp

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg Tyr Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Arg Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190







(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ala Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Val Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid 

( C ) STRANDEDNESS : unknown 

( D ) TOPOLOGY : unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr Tyr Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ser Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Lys Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
                185                 190

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Tyr Glu Val Arg Asn Val             Tyr     Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Phe Glu Ala Ala Asp Leu Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Leu Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Thr Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:




Tyr Glu Val Arg Asn Val Ser Gly     Tyr     Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Thr Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Gln His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Ala Gln Val Arg Asn Thr Ser Arg Gly Tyr Met Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Glu Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu
                 20                  25                  30

His Val Pro Gly Cys Ile Pro Cys Glu Arg Leu Gly Asn Thr Ser
                 35                  40                  45

Arg Cys Trp Ile Pro Val Thr Pro Asn Val Ala Val Arg Gln Pro
                 50                  55                  60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val
                 65                  70                  75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg
                 95                 100                 105

Arg His Trp Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro
                140                 145                 150

Glu Val Ile Ile Asp Ile Ile Gly Gly Ala His Trp Gly Val Met
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Ile Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala
                185                 190









(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:



Ala Gln Val Lys Asn Thr Thr Asn Ser Tyr Met Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu
                 20                  25                  30

His Val Pro Gly Cys Val Pro Cys Glu Lys Thr Gly Asn Thr Ser
                 35                  40                  45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Arg Gln Pro
                 50                  55                  60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val
                 65                  70                  75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln
                 95                 100                 105

His His Trp Phe Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro
                140                 145                 150

Glu Val Ile Leu Asp Ile Val Ser Gly Ala His Trp Gly Val Met
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:




Ala Glu Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp
5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu
20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Arg Val Gly Asn Ala Ser
35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro
50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val
65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Ile Ser Pro Gln
95 100 105


His 


His 


Trp 


Phe 


Val 


Gin 


Glu 


Cys 


Asn 


Cys 


Ser 


He 




Pro Gly 








110 










115 








120 


Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro
140 145 150

Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His Trp Gly Val Met
155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
170 175 180

Val Val Ile Leu Leu Leu Thr Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: USIO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 



Val Gln Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp
5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Glu Ala Ala Val Leu
20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser
35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro
50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val
65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Phe Cys
80 85 90

Gly Gly Met Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg
95 100 105

His His Ser Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Ala Thr Leu Ile Leu Ala Tyr Val Met Arg Val Pro
140 145 150

Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His Trp Gly Val Leu
155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
170 175 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala
185 190







(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 






Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp
5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu
20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu
35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg
50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val
65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Leu Ile Ile Ser Pro Glu
95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
140 145 150

Glu Leu Ala Leu Gln Val Val Phe Gly Gly His Trp Gly Val Val
155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DKll 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:






Arg Asn 


Thr 


Ser Ser 


Ser 


Tyr Tyr 


Ala Thr 


Asn Asp 


5 








10 






15 


Asn Ser 


He 


Thr Trp 


Gin 


Leu Thr 


Asn Ala 


Val 


Leu 


20 






25 






30 


Gly Cys 


Val 


Pro Cys 


Glu 


Asn Asp 


Asn Gly 


Thr 


Leu 


35 








40 






45 


He Gin 


Val 


Thr Pro 


Asn 


Val Ala 


Val Lys 


His 


Arg 


50 








55 






60 


Thr His 


Asn 


Leu Arg 


Ala 


His He 


Asp Met 


He 


Val 


65 






70 






75 
Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Phe Ile Val Ser Pro Glu
95 100 105

His His His Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val
155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:



Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp
5 10 15

Cys Ser Asn Ser Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu
20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu
35 40 45

His Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg
50 55 60

Gly Ala Leu Thr His Asn Leu Arg Ala His Val Asp Met Ile Val
65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Phe Ile Ile Ser Pro Glu
95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
110 115 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val
155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 



Val Glu Val Arg Asn Thr Ser Phe Ser Tyr Tyr Ala Thr Asn Asp
5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu
20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu
35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg
50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val
65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
80 85 90

Gly Ala Val Met Ile Ala Ser Gln Ala Phe Ile Ile Ser Pro Glu
95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val
155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
185 190


(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 



Val Glu Val Lys Asp Thr Gly Asp Ser Tyr Met Pro Thr Asn Asp
5 10 15

Cys Ser Asn Ser Ser Ile Val Trp Gln Leu Glu Gly Ala Val Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Glu Arg Thr Ala Asn Val Ser
35 40 45

Arg Cys Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro
50 55 60

Gly Ala Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val
65 70 75

Met Ser Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
80 85 90

Gly Ala Leu Met Leu Ala Ala Gln Val Val Val Val Ser Pro Gln
95 100 105

His His Thr Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro
140 145 150

Glu Val Ile Leu Asp Ile Val Thr Gly Gly His Trp Gly Val Met
155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ser Trp Ala Lys Val
170 175 180

Ile Val Ile Leu Leu Leu Thr Ala Gly Val Glu Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK12 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Leu Glu Trp Arg Asn Val Ser Gly Leu Tyr Val Leu Thr Asn Asp
5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
35 40 45

Thr Cys Trp Thr Ser Val Thr Pro Thr Val Ala Val Arg Tyr Val
50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro
140 145 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Met
155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala
185 190





(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HKIO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:




Trp 


Arg 


Asn 


Val 


Ser 


Gly 


Leu 


Tyr 


Val 


Leu Thr 


Asn Asp 


5 










10 






15 


Asn 


Ser 


Ser 


He 


Val 


Tyr 


Glu 


Ala 


Asp 


Asp Val 


He Leu 






20 








25 






30 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val 


Gin 


Asp 


Gly Asn 


Thr Ser 




35 










40 






45 


Trp 


Thr 


Ser 


Val 


Thr 


Pro 


Thr 


Val 


Ala 


Val Arg 


Tyr Val 




50 










55 






60 


Thr 


Thr 


Ala 


Ser 


He 


Arg 


Ser 


His 


Val 


Asp Leu 


Leu Val 






65 








70 






75 


Ala 


Thr 


Met 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly Asp 


Met Cys 






80 








85 






90 
Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro
140 145 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu
155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

( D ) TOPOLOGY : unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S2 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

\u Thr\ Asn Asp 
15 

ip Val lie Leu 
30 

y Asn Thr Ser 
45 

l1 Arg Tyr Val 
60 

\p Leu Leu Val 
75 

.y Asp Met Cys 
90 

le Arg Pro Arg 
105 

su Tyr Pro Gly 
120 

it Met Asn Trp 
135 

»u Arg Leu Pro 
150 

p Gly lie Leu 

155 ' 160 165 



.) 


SEQUENCE 


Trp 


Arg 


Asn 


Thr 






5 




Asn 


Ser 


Ser 


He 






20 




Pro 


Gly 


Cys 


Val 






35 




Trp 


Thr 


Pro 


Val 




50 




Thr 


Thr 


Ala 


Ser 






65 




Ala 


Thr 


Met 


Cys 






80 




Val 


Phe 


Leu 


Val 






95 




Gin 


Thr 


Val 


Gin 






110 




Ser 


Gly 


His 


Arg 






125 




Ala 


Val 


Gly 


Met 






140 




Val 


Phe 


Asp 


He 



Ser Gly Leu Tyr Val 








10 


Val Tyr Glu Ala Asp 








25 


Pro 


Cys 


Val 


Gin Asp 








40 


Thr 


Pro 


Thr 


Val Ala 








55 


He 


Arg 


Ser 


His Val 






70 


Ser 


Ala 


Leu 


l"yr Val 








85 


Gly 


Gin 


Ala 


Phe Thr 






100 


Thr 


Cys 


Asn 


Cys Ser 






115 


Met 


Ala 


Trp 


Asp Met 








130 


Val 


Val 


Ala 


His Val 








145 


He 


Ala 


Gly 


Ala His 
Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S52 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:






Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp
5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
35 40 45

Met Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro
140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu
155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
170 175 180

Ala Ile Val Met Ile Met Phe Ser Gly Val Asp Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S54

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:



Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Ile Leu Thr Asn Asp
5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
35 40 45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro
140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu
155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
170 175 180

Ala Ile Ile Met Ile Met Phe Ser Gly Val Asp Ala
185 190









(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: 24

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:



Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu
20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Thr Ser
35 40 45

Arg Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro
50 55 60

Gly Ala Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Gly Ala Phe Leu Met Gly Gln Met Ile Thr Phe Arg Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Glu Cys Asn Cys Ser Ile Tyr Thr Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro
140 145 150

Thr Ala Phe Leu Asp Met Val Ala Gly Gly His Trp Gly Val Leu
155 160 165

Ala Gly Leu Ala Tyr Phe Ser Met Gln Gly Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Zl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:



Val His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Thr Ser Ile Val Tyr Glu Thr Glu His His Ile Met
20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Arg Thr Glu Asn Thr Ser
35 40 45

Arg Cys Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro
50 55 60

Asn Ala Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Phe Tyr Ile Gly Asp Leu Cys
80 85 90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Asp Phe Arg Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro
140 145 150

Ser Ile Leu Gly Asp Leu Leu Thr Gly Gly His Trp Gly Val Leu
155 160 165

Ala Gly Leu Ala Phe Phe Ser Met Gln Ser Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Glu Gly
185 190



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: 26 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 



Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Glu His Gln Ile Leu
20 25 30

His Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser
35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Val Ser Tyr Ile
50 55 60

Gly Ala Pro Leu Asp Ser Leu Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Gly Ala Phe Leu Val Gly Gln Met Phe Ser Phe Gln Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Val Met Arg Ile Pro
140 145 150

Ser Thr Leu Val Asp Leu Leu Ala Gly Gly His Trp Gly Val Leu
155 160 165

Val Gly Leu Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

( C ) STRANDEDNESS : unknown 

( D ) TOPOLOGY : unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: 27 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:



Val Asn Tyr His Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Met Tyr Glu Ala Glu His His Ile Leu
20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Gln Ser
35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile
50 55 60

Gly Ala Pro Leu Glu Ser Ile Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Ala Ala Thr Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys
80 85 90

Gly Gly Val Phe Leu Val Gly Gln Met Phe Ser Phe Gln Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly
110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Thr Leu Val Leu Ala Gln Val Met Arg Ile Pro
140 145 150

Ser Thr Leu Val Asp Leu Leu Thr Gly Gly His Trp Gly Ile Leu
155 160 165

Ile Gly Val Ala Tyr Phe Cys Met Gln Ala Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Tyr Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

( C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:






Tyr Asn 






- Ann 




Ser 




Val 




Hi a 


Val Thr 


Asn Asp 










5 










X V 








15 


Cys 


Pro 




Q o V" 


OCX 


Tie 


Val 

V CL JL 


TS/T* 




Th-r 

X liX 




Tyr His 


He 


Leu 










20 










25 








30 


His 


Leu 


pro 




Vjr B 


V CLX 


Pt-o 




Val 


Arg 


V3XU 


Gly Asn 


Lys 


Ser 


























45 


Thr 


Cys 


_ 

Trp 


Va.X 


QOT- 

J. 




X Xix 


•tr i. W.> 


1 HIT 


Va 1 
VdX 


Axa 


Ala Gin 


His 


Leu 










DU 










c: e: 

3 D 








60 


Asn Ala Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Gly Ala Thr Leu Cys Ser Ala Leu Tyr Ile Gly Asp Val Cys
80 85 90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Thr Phe Gln Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Thr Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro
140 145 150

Gly Ala Met Val Asp Leu Leu Ala Gly Gly His Trp Gly Ile Leu
155 160 165

Val Gly Ile Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 



Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Ser Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser
35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Thr Phe
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Met Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly
185 190
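
The three-letter residue rows of a listing such as SEQ ID NO: 96 can be collapsed to one-letter codes for comparison between isolates. The following is a minimal sketch only, not part of the original listing; the function name `to_one_letter` is a hypothetical helper, and the sample row is the first row (residues 1 to 15) of SEQ ID NO: 96.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter residues to one-letter codes.

    Raises KeyError on any token that is not a standard residue, which also
    flags OCR damage such as "Gin" or "He".
    """
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues.split())


# Residues 1-15 of SEQ ID NO: 96 (isolate SA1), as given in the listing.
first_row = "Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp"
print(to_one_letter(first_row))
```

Applying the same function to all thirteen rows of a sequence and checking that the result has 192 characters gives a quick consistency check against the stated LENGTH field.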





(2) INFORMATION FOR SEQ ID NO: 97: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA4



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 



Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser
35 40 45

Lys Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190







(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 



Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Lys Glu Gly Asn Val Ser
35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Val Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Val Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Val Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly
185 190

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser
35 40 45

Arg Cys Trp Val His Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
80 85 90

Gly Ala Leu Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Ala Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Ala Thr Ala Leu Val Met Ala Gln Met Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190











(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA7 



WO 96/05315

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asn Asn Val Ser
35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Ser Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Gly Asn Val Ser
35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Ser Pro Arg
95 100 105

Arg His Asn Val Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Tyr Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 102: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:



Leu 


Thr 


Tyr 


Gin 


Asn 


Ser 


Ser 


Gin 


Leu 


Tyr 


His 


Leu 


Thr Asn Asp 








1 










10 






15 


Cys 


Pro 


Asn 


Ser 


Ser 


He 


Val 


Leu 


Glu 


Ala 


Asp 


Ala 


Met He Leu 








20 










25 






30 


His 


Leu 


Pro 


Gin 


Cys 


Leu 


Pro 


Cys 


Val 


Arg 


Val 


Asp 


Asp Arg Ser 










35 










40 






45 


Thr 


Cys 


Trp 


His 


Ala 


Val 


Thr 


Pro 


Thr 


Leu 


Ala 


He 


Pro Asn Ala 






50 










55 






60 


Ser 


Thr 


Pro 


Ala 


Thr 


Gin 


Phe 


Arg 


Arg 


His 


Val 


Asp 


Leu Leu Ala 










65 










70 






75 


Gin 


Ala 


Ala 


Val 


Val 


Cys 


Ser 


Ser 


Leu 


Tyr 


He 


Gin 


Asp Leu Cys 










80 










85 






90 


Gin 


Ser 


Leu 


Phe 


Leu 


Ala 


Gin 


Gin 


Leu 


Phe 


Thr 


Phe 


Gin Pro Arg 










95 










100 






105 


Arg 


His 


Trp 


Thr 


Val 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr Thr Gin 






110 










115 






120 


His 


Val 


Thr 


Gin 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met 


Met Asn Trp 










125 








130 






135 


Ser 


Pro 


Thr 


Thr 


Thr 


Leu 


Val 


Leu 


Ser 


Ser 


lie 


Leu 


Arg Val Pro 



140 145 150 



PCT/US95/10398



Glu Ile Cys Ala Ser Val Ile Phe Gin Gin His Trp Gin Ile Leu
155 160 165

Leu Ala Val Ala Tyr Phe Gin Met Ala Gin Asn Trp Leu Lys Val
170 175 180

Leu Ala Val Leu Phe Leu Phe Ala Gin Val Glu Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT AGA TTG GGT GTG CGC GCG CCG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC 195
CCC AAG GCA CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAG CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGG TGG GCG GGA TGG CTC CTG TCT CCC CGT GGC TCT CGG 312
CCT AGC TGG GGC CCC ACA GAC CCC CGG CGC AGG TCG CGC 351
AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT 429
CTT GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTT TTG GCC CTG CTC 546
TCT TGC CTG ACC GTG CCC GCT TCG GCC 573



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US 11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC 195
CCC AAG GCA CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGG TGG GCG GGA TGG CTC CTG TCT CCC CGT GGC TCT CGG 312
CCT AGC TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTT ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT 429
CTC GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTC 546
TCT TGC CTG ACT GTG CCC GCT TCA GCC 573
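
Each row of a nucleotide listing ends with a running base count (39, 78, ... 573), so a transcription can be checked mechanically. The sketch below is illustrative only and not part of the original listing; `check_running_counts` is a hypothetical helper, and the sample rows are the first two rows of SEQ ID NO: 104.

```python
def check_running_counts(rows):
    """Return indices of rows whose printed count differs from the cumulative base count.

    Each row is expected to hold space-separated codons followed by the
    running total of bases printed at the end of that row.
    """
    bad, total = [], 0
    for i, row in enumerate(rows):
        *codons, printed = row.split()
        total += sum(len(codon) for codon in codons)
        if int(printed) != total:
            bad.append(i)
    return bad


# First two rows of SEQ ID NO: 104 (13 codons, i.e. 39 bases, per full row).
rows = [
    "ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39",
    "AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78",
]
print(check_running_counts(rows))
```

An empty result means every printed count agrees with the bases actually present; a 573-base sequence should yield fourteen full rows and a final row of nine codons.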



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC 195
CCC AAG GCA CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAT CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGG TGG GCG GGA TGG CTC CTG TCT CCC CGT GGC TCT CGG 312
CCT AGG TGG GGG CCC ACA GAC GCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCC 429
CTC GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTC CTA GCC CTG CTT 546
TCT TGC CTG ACT GTG CCC GCT TCA GCC 573



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT ASA TTG GGT GTG CGC GCG ACG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC 195
CCC AAG GCG CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAT CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGA TGG GCG GGA TGG CTC CTG TCC CCC CGT GGC TCT CGG 312
CCT AGC TGG GGC CCT ACA GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAG ATT CCG CTC GTC GGC GCC CCT 429
CTT GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTT 546
TCT TGC CTG ACA GTG CCC GCG TCA GCC 573
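
The 573-base core gene sequences in this listing are written as codon triplets, so a row can be translated directly with the standard genetic code. The following is a minimal sketch, not part of the original listing; the names `CODON_TABLE` and `translate` are hypothetical, and the sample is the first row (bases 1 to 39) of SEQ ID NO: 106.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codons: str) -> str:
    """Translate space-separated DNA codons into one-letter amino acids.

    Tokens that are not valid codons (for example OCR damage such as "6AA")
    are rendered as "?" rather than guessed.
    """
    return "".join(CODON_TABLE.get(c.upper(), "?") for c in codons.split())


# Bases 1-39 of SEQ ID NO: 106 (isolate SW1), as given in the listing.
first_row = "ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT"
print(translate(first_row))
```

An in-frame "*" or "?" in the translation of a full sequence is a useful signal that a codon may have been misread during transcription and should be checked against the original document.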



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SIS

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:




ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGC GGT AGA CGT CAG CCT ATC 195
CCC AAG GCG CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGG TGG GCG GGA TGG CTC CTG TCC CCC CGT GGC TCC CGG 312
CCT AGC TGG GGC CCT ACA GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGC AAA GTC ATC GAT ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCT 429
CTC GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTC 546
TCT TGT CTG ACT GTG CCC GCG TCA GCT 573



(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DR4



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGO GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC 195
CCC AAG GCG CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGG TGG GCG GGA TGG CTC CTG TCC CCC CGT GGC TCT CGG 312
CCT AGC TGG GGC CCC ACA GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAC ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCC CCC 429
CTT GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGA 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAT CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTT TTG GOT TTG CTC 546
TCT TGC TTG ACC GTG CCC GCA TCG GCC 573

(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTC TAT CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CAG CCC GAG GGC AGG ACC TGG GCC CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCT CGG 312
CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCT 429
TTA GGG GGC GCT GCC AGG GCC TTG GCG CAT GGC GTC CGG 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAT TTG 507
CCC GGT TGC CCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGT TTA ACC ATC CCA GCT TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S45

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:



ATG AGC ACG AAT CCT AAA CCT CAA AGA CAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCA CAA CCT CGT GGA CGG CGA CAA CCT ATp 195
CCC AAG GCT CGC CGG CCC GAG GGC AGG GCC TGG GCC CAG 234
CCC GGG CAT CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCC CGG 312
CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
CTA GGG GGC GCT GCC AGA GCC TTG GCG CAT GGC GTC CGG 468
GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT CTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT CTG CTG 546
TCC TGC TTG ACC ATC CCA GCT TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGT AGG GCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAC GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG 312
CCT AGT TGG GGC CCC ACC GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCC CCC 429
CTA GGG GGT GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAG GAC GGC GTG AAT TAT GCA ACA GGG AAT TTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGT TTG ACC ATC CCA GCT TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:



ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGT 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGG 


CCC 


GAG 


GGC 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


TAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAC 


GAG 


GGC 


ATG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGT 


GGC 


TCC 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACG 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATT 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG 


GGC 


GCT 


GCC 


AGG 


GCC 


TTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468


GTT 


CTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


GGG 


AAC 


TTG 


507 


CCC 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTC 


TTG 


GCT 


TTG 


CTG 


546 


TCC 


TGT 


TTG 


ACC 


ATT 


CCA 


GCT 


TCC 


GCT



(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 



ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGT 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGG 


CCC 


GAG 


GGC 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC


GGG 


TAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GAG 


GGC 


TTG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGT 


GGC 


TCT 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACG 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATT 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG 


GGC 


GCT 


GCC 


AGG 


GCC 


CTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468 


GTT 


CTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


GGG 


AAT 


CTG 


507 


CCC 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTC 


TTG 


GCT 


TTG 


CTG 


546 


TCC 


TGC 


CTG 


ACC 


ATC 


CCA 


GCG 


TCC 


GCT



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: DK1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:




ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGT 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGG 


CCC 


GAG 


GGC 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


TAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GAG 


GGC 


ATG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGC 


GGC 


TCT 


CGG 


312



CCT 


AGT TGG 


GGC 


CCC 


AAC GAC CCC CGG CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG GGT 


AAG 


GTC 


ATC GAT ACC CTC ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC CTC 


ATG 


GGG 


TAC ATT CCG CTC GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG GGC 


GCT 


GCC 


AGG GCC CTG GCG CAT 


GGC 


GTC 


CGG 


468 


GTT 


CTG GAG 


GAC 


GGC 


GTG AAC TAC GCA ACA 


GGG 


AAT 


TTG 


507 


CCC 


GGT TGC 


TCT 


TTC 


TCT ATC TTC CTC TTG 


GCT 


CTG 


TTG 


546 


TCC 


TGT TTG 


ACC 


ATC 


CCA GCT TCC GCC 








573 


(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:




ATG 


AGC ACG 


AAT 


CCT 


AAA CCT CAA AGA AAA 


ACC 


AAA


CGT 


39 


AAC 


ACC AAC 


CGC 


CGC 


CCA CAG GAC GTC AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGT CAG 


ATC 


GTT 


GGT GGA GTT TAC CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC CCC 


AGG 


TTG 


GGT GTG CGC GCG ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG CGG 


TCG 


CAA 


CCT CGT GGA AGG CGA 


CAG 


CCT 


ATC 


195 


CCC 


AAG GCT 


CGC 


CAG 


CCC GAG GGC AGG GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG TAC


CCT 


TGG 


CCC CTC TAT GGC AAT 


GAG 


GGC 


ATG 


273 


GGG 


TGG GCA 


GGA 


TGG 


CTC CTG TCA CCC CGT 


GGC 


TCC 


CGG 


312 


CCT 


AGT TGG 


GGC 


CCC 


ACA GAC CGC CGG CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG GGT 


AAG 


GTC 


ATC GAT ACC CTC ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC CTC 


ATG 


GGG 


TAC ATT CCG CTC GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG GGC 


GCT 


GCC 


AGG GCT CTG GCA CAT 


GGT 


GTC 


CGG 


468 


GTT 


CTG GAG 


GAC 


GGC 


GTG AAC TAT GCA ACA 


GGG 


AAT 


TTG 


507 


CCC 


GGT TGC 


TCT 


TTT 


TCT ATC TTC CTC TTG 


GCT 


CTG 


CTG 


546 


TCT 


TGT CTG 


ACC 


ATC 


CCA GCT TCC GCT 








573 


(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:



ATG


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


78


GGT 


GGC 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


CGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CAG CCC 


GAG 


GGC 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCT 


GGG 


TAC 


CCC 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GAG 


GGC 


ATG 


273 


GGA


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCC 


CCC 


CGC 


GGC 


TCT 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACT 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


GTC 


ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATT 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG 


GGC 


GCT 


GCC 


AGG 


GCC 


CTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468 


GTC 


CTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


GGG 


AAT 


CTG 


507 


CCC 


GGT 


TGC 


TCC 


TTT 


TCT 


ATC 


TTC 


CTC 


TTG 


GCT 


TTG 


CTG 


546 


TCC 


TGT 


CTG 


ACC 


ATC 


CCA 


GCT 


TCC 


GCT



(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:




ATG 



AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGC 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGG 


CCC 


GAG 


GGT 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


TAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GAG 


GGC 


TTG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGC 


GGT 


TCT 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACA 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG 


GGT 


AAA 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG 


GGC 


GCT 


GCC 


AGG 


GCC 


CTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468 


GTC 


CTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


GGG 


AAC 


TTG 


507 


CCC 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTT 


TTA 


GCT 


TTG 


CTA 


546 


TCC 


TGT 


TTG 


ACC 


ATC 


CCA 


GCT 


TCC 


GCT 










573 






(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 




(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND8



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 






ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGC 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGG 


CCC 


GAG 


GGT 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


CAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GAG 


GGC 


TTG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGC 


GGC 


TCT 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACA 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG 


GGT 


GCT 


GCC 


AGG 


GCC 


CTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468 


GTC 


CTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


GGG 


AAC 


TTG 


507 


CCC 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTT 


TTG 


GCT 


TTG 


CTA 


546 


TCC 


TGT 


TTG 


ACC 


GTC 


CCA 


GCT 


TCC 


GCT



(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S9 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 



ATG 


AGC 


ACG 


AAT 


CCT


AAA


CCT


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTT 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGT 


CAG 


ATC 


GTC 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCA 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CAT 


CCC 


GAG 


GGC 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


TAC 


CCT 


TGG 


CCC 


CTC 


TAC 


GGC 


AAT 


GAG 


GGC 


TTG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGT 


GGC 


TCT 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


AAT 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTT 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATT 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG 


GGC 


GCT 


GCC 


AGG 


GCT 


CTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468 






GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAC CTC 
CCC GGT TGC TCT TTC TCT ATC TTC CTT CTG GCT TTG CTG 
TCC TGT TTG ACC ATC CCA GCT TCC GCT 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:



ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


GGT 


GGT 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACC 


AGG 


AAG 


ACT 


TCA 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


CCC 


AAG 


GCT 


CGC 


CAA 


CCC 


GAG 


GGC 


AGG 


ACC 


TGG 


GCT 


CAG 


CCC 


GGG 


TAT 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAC 


GAG 


GGC 


ATG 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGC 


GGC 


TCT 


CGG 


CCT 


AAT 


TGG 


GGC 


CCC 


ACG 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGC 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACG 


TGC 


GGC 


TTC 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCG 


CTC 


GTC 


GGT 


GCC 


CCC 


CTA 


GGG 


GGC 


GTT 


GCC 


AGA 


GCC 


TTG 


GCA 


CAT 


GGT 


GTC 


CGG 


GTT 


CTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT GCA


ACA 


GGG 


AAT 


TTA 


CCC 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTC 


TTG 


GCT 


TTG 


CTG 


TCC 


TGC 


TTG 


ACC 


ACC 


CCA 


GCT 


TCC 


GCT 











(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 






AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACC 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA AGG 


CGA 


CAA 


CCT ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGA 


CCC 


GAG 


GGC 


AGG 


ACC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


TAT 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GAG 


GGC 


ATG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CAT 


GGC 


TCT 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACG 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACG 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


CTA 


GGG 


GGC 


GTT 


GCC 


AGA 


GCC 


CTG 


GCA 


CAC 


GGT 


GTC 


CGG 


468


GTT 


CTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAC 


GCA 


ACA 


GGG 


AAT 


ATA 


507 


CCC 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTT 


TTG 


GCT 


TTG 


CTG 


546 


TCC 


TGT 


CTG 


ACC 


ACC 


CCA 


GTT 


TCC 


GCT 










573 



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 






ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAG 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTT 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGC 


CAG 


ATC 


GTC 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG TTG


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


AGT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CAA 


CCC 


GAG 


GGC 


AGG 


ACC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


TAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GAG 


GGC 


ATG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGC 


GGC 


TCT 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACG 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGC 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATT 


CCG 


CTC 


GTC 


GGC 


GCC 


CCC 


429 


TTA 


GGG 


GGC 


GTT 


GCC 


AGA 


GCC 


CTG 


GCA 


CAT 


GGT 


GTC 


CGG 


468 


GTT 


GTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


GGG 


AAT 


TTG 


507 


CCC 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTC 


TTG 


GCT 


CTG 


CTG 


546 


TCC 


TGT 


TTG 


ACC 


ATC 


CCA 


GCT 


TCC 


GCT 










573 




(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:



ATG 


AGC


ACG 


ACT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AGC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTT 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGT 


GAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGA 


TCG 


CAA 


CCT 


CGT 


GGC 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGG 


CCC 


GAG 


GGT 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


CAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GCC 


AAT 


GAG 


GGC 


TTG 


273 


GGG


TGG 


GCG 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGC 


GGC 


TCC 


CGG 


312 


CCT 


AGT 


TGG 


GGC 


CCC 


ACG 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGC 


351 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTC 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATT 


CCG 


CTC 


GTC 


GGC 


GGC 


CCC 


429 


CTA 


GGG 


GGC 


GTT 


GCC 


AGG 


GCC 


CTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468 


GTT 


GTG 


GAG 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


GGG 


AAT 


CTG 


507 


CCT 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTT 


TTG 


GCT 


TTG 


CTG 


546 


TCT 


TGT 


CTG 


ACC 


ATC 


CCA 


GCT 


TCC 


GCT 










573 




(2) INFORMATION FOR SEQ ID NO: 124: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 



ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


CGT 


39 


AAC 


ACC 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTT 


AAG 


TTC 


CCG 


GGC 


78 


GGT 


GGT 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


CTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


AGG 


AAG 


ACT 


156 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGT 


GGA 


AGG 


CGA 


CAA 


CCT 


ATC 


195 


CCC 


AAG 


GCT 


CGC 


CGG 


CCC 


GAG 


GGT 


AGG 


GCC 


TGG 


GCT 


CAG 


234 


CCC 


GGG 


TAC 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


GAC 


GAG 


GGC 


ATG 


273 


GGG 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCA 


CCC 


CGC 


GGC 


TCC 


CGG 


312 


CCT 


AAT 


TGG 


GGC 


CCC 


ACA 


GAC 


CCC 


CGG 


CGT 


AGG 


TCG 


CGT


351 


AAT 


CTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACA 


TGC 


GGC 


TTC 


390


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATT 


CCG 


CTC 


GTC 


GGC 


GCT 


CCC 


429 


TTA 


GGG 


GGC 


GTT 


GCC 


AGG 


GCC 


CTG 


GCG 


CAT 


GGC 


GTC 


CGG 


468 


GTT 


CTG 


GAG 


GAC 


GGC 


GTG 


AAT 


TAC 


GCA 


ACA 


GGG 


AAT 


TTG 


507 


CCT 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTC 


TTG 


GCT 


TTG 


CTG 


546 


TCC 


TGC 


TTG 


ACC 


ATC 


CCA 


GCT 


TCC 


GCT 










573



(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 




ATG 


AGC 


ACA 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


AGA 


39 


AAC 


ACC 


AAC 


CGT 


CGC 


CCA 


CAG 


GAC 


GTT 


AAG 


TTC 


CCG 


GGC 


78 


GGC 


GGC 


CAG 


ATC 


GTT 


GGC 


GGA 


GTA 


TAC 


TTG 


TTG 


CCG 


CGC 


117 


AGG


GGC 


CCC 


AGG


TTG 


GGT 


GTG 


CGC 


GCG 


ACA 


AGG 


AAG 


ACT 


156 


TCG 


GAG 


CGA 


TCC 


CAG 


CCA 


CGT 


GGG 


AGG 


CGC 


CAG 


CCC 


ATC 


195 


CCC 


AAA 


GAT 


CGG 


CGC 


TCC 


ACT 


GGC 


AAG 


TCC 


TGG 


GGA 


AAA 


234 


CCA 


GGA 


TAT 


CCC 


TGG 


CCC 


CTG 


TAT 


GGG 


AAT 


GAG 


GGA 


CTC 


273 


GGC 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCC 


CCC 


CGA 


GGT 


TCC 


CGT 


312 


CCC 


TCC 


TGG 


GGC 


CCC 


AAT 


GAC 


CCC 


CGG 


CAT 


AGG 


TCG 


CGC 


351 


AAC 


GTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTA 


ACG 


TGC 


AGC 


CTT 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


GTC 


CCC 


GTC 


GTA 


GGC 


GGC 


CCG 


429 


TTG 


GGT 


GGC 


GTC 


GCC 


AGA 


GCT 


CTC 


GCG 


CAT 


GGC 


GTG 


AGA 


468 


GTC 


CTG 


GAG 


GAC 


GGG 


GTT 


AAT 


TAT 


GCA 


ACA 


GGG 


AAC 


TTA 


507 


CCT 


GGT 


TGC 


TCC 


TTT 


TCT 


ATT 


TTC 


TTG 


CTG 


GCC 


CTA 


CTG 


546 


TCC 


TGC 


ATC 


ACC 


ATT 


CCA 


GTC 


TCC 


GCT 










573 



(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:



ATG 


AGC 


ACA 


AAT 


CCT 


AAA 


CCT 


CAA AGA 


AAA 


ACC 


AAA 


AGA 


39 


AAC 


ACT 


AAC 


CGT 


CGC 


CCA 


CAA 


GAC 


GTT 


AAG 


TTT 


CCG 


GGC 


78 


GGC 


GGC 


CAG 


ATC 


GTT 


GGC 


GGA 


GTA 


TAC 


TTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACA 


AGG 


AAG 


ACT 


156 


TCG 


GAG 


CGG 


TCC 


CAG 


CCA 


CGT 


GGG 


AGG 


CGC 


CAG 


CCC 


ATC 


195 


CCC 


AAA 


GAT 


CGG 


CGC 


CCC 


ACT 


GGC 


AAG 


TCC 


TGG 


GGA 


AAA 


234 


CCA 


GGA 


TAC 


CCT 


TGG 


CCC 


CTA 


TAT 


GGG 


AAT 


GAG 


GGA 


CTC 


273


GGC


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCC 


CCC 


CGA 


GGT 


TCC 


CGT 


312 


CCC 


TCT 


TGG 


GGC 


CCC 


ACT 


GAT 


CCC 


CGG 


CAT 


AGG 


TCG 


CGC 


351


AAC 


GTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTA 


ACG 


TGC 


GGC 


TTT 


390 


GCC 


GAC 


CTC 


ATG 


GGA 


TAC 


ATC 


CCC 


GTC 


GTG 


GGC 


GCT 


CCG 


429 


CTT 


GGT


GGC 


GTC 


GCC 


AGA 


GCT 


CTC 


GCG 


GAT 


GGC 


GTG 


AGG 


468 


GTC 


CTG 


GAG 


GAC 


GGG 


GTT 


AAT 


TAT 


GCA 


ACA 


GGG 


AAC 


TTA 


507 


CCC 


GGT 


TGC 


TCC 


TTT 


TCT 


ATC 


TTC 


TTG 


CTG 


GCC 


TTA 


CTG 


546 


TCC 


TGC 


ATC 


ACC 


ATT 


CCA 


GTC 


TCT 


GCT 










573 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:



ATG 


AGC 


ACA 


AAT 


CCA 


AAA 


CCC 


CAA 


AGA 


AAA 


ACC 


ATA 


AGA 


39 


AAC 


ACC 


AAC 


CGT 


CGC 


CCA 


CAG 


GAC 


GTT 


AAG 


TTC 


CCG 


GGC 


78 


GGC 


GGC 


CAG 


ATC 


GTT 


GGC 


GGA 


GTA 


TAC 


TTG 


TTG 


CCG 


CGC 


117 


AGG 


GGC 


CCT 


AGG 


TTG 


GGT 


GTG 


CGC 


ACG 


ACA 


AGG 


AAG 


ACT 


156 


TCG 


GAG 


CGG 


TCC 


CAG 


CCA 


CGT 


GGG 


AGG 


CGC 


CAG 


CCC ATC 


195 


CCC 


AAA 


GAT 


CGG 


CGC 


TCC 


ACT 


GGC 


AAG 


TCC 


TGG 


GGA 


AAA 


234 


CCA 


GGA 


TAC 


CCC 


TGG 


CCT 


CTA 


TAT 


GGG 


AAT 


GAG 


GGA 


CTC 


273 


GGC 


TGG 


GCG 


GGA TGG 


CTC 


CTG 


TCC 


CCC 


CGA 


GGT 


TCG CGT


312 


CCC 


TCT 


TGG 


GGC 


CCC 


AGT 


GAC 


CCC 


CGG 


CAT 


AGG 


TCG CGC 


351 


AAC 


GTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTA 


ACG 


TGC 


GGC 


TTT 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCC 


GTC 


GTA 


GGC 


GCC 


CCG 


429 


CTT 


GGT 


GGC 


GTT 


GCC 


AGA 


GCT 


CTC 


GCG 


CAC 


GGC 


GTG 


AGA 


468 


GTC 


CTG 


GAG 


GAC 


GGG 


GTT 


AAT 


TAT 


GCA 


ACA 


GGG 


AAC 


CTA 


507 


CCT 


GGT 


TGC 


TCT 


TTT 


TCT 


ATC 


TTC 


TTG 


CTG 


GCC 


CTA 


CTG 


546 


TCC 


TGC 


ATC 


ACC 


ACT 


CCG 


GCC 


TCT 


GCT



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:



ATG 


AGC 


ACA 


ATT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


AGA 


39 


AAC 


ACT 


AAC 


CGT 


CGC 


CCA 


CAA 


GAC 


GTT 


AAG 


TTT 


CCG 


GGC 


78 


GGC 


GGC 


CAG 


ATC 


GTT 


GGC 


GGA 


GTA 


TAC 


TTG 


CTG 


CCG 


CGC 


117 


AGG


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACA 


AGG 


AAG 


ACT 


156 


TCG 


GAG 


CGG 


TCC 


CAG 


CCT 


CGT 


GGA 


AGG 


CGC 


CAG 


CCC 


ATC 


195 


CCT 


AAA 


GAT 


CGG 


CGC 


TCC 


ACT 


GGC 


AAG 


TCC 


TGG 


GGA 


AAA 


234 


CCA 


GGA 


TAC 


CCC 


TGG 


CCC 


CTG 


TAT 


GGG 


AAT 


GAG 


GGG 


CTC 


273 


GGC 


TGG 


GCA 


GGA 


TGG 


CTC 


CTG 


TCC 


CCC 


CGA 


GGT 


TCT 


CGT 


312 


CCC 


TCT 


TGG 


GGC 


CCC 


AAT 


GAC 


CCC 


CGG 


CAT 


AGG 


TCG 


CGC 


351 


AAT 


GTG 


GGT 


AAA 


GTC 


ATC 


GAT 


ACC 


CTA 


ACG 


TGC 


GGC 


TTT 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCC 


GTC 


GTA 


GGC 


GCC 


CCG 


429 


CTT 


GGT 


GGT 


GTC 


GCC 


AGA 


GCT 


CTT 


GCG 


CAT 


GGC 


GTG 


AGA 


468 


GTC 


CTG 


GAG 


GAC 


GGA 


GTT 


AAT 


TAT 


GCA 


ACA 


GGT 


AAC 


TTA 


507 


CCC 


GGT 


TGC 


TCC 


TTT 


TCT 


ATC 


TTC 


TTG 


CTA 


GCC 


CTG 


CTG 


546 


TCC 


TGC 


ATC 


ACT 


ATT 


CCG 


GTT 


TCA 


GCT



(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:



ATG 


AGC 


ACA 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


AGA 


39 


AAC 


ACA 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGT 


78 


GGC 


GGC 


CAG 


ATC 


GTT 


GGC 


GGA 


GTT 


TAC 


TTG 


CTG 


CCG 


CGC 


117 


AGG 


GGC 


CCT 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACA 


AGG 


AAG ACT 


156 


TCC 


GAG 


CGA 


TCC 


CAG 


CCG 


CGT 


GGG 


AGA 


CGC 


CAG 


CCC 


ATC 


195 


CCG 


AAA 


GAT 


CGG 


CGC 


TCC 


ACC 


GGC 


AAG 


TCC 


TGG 


GGA AAA 


234 


CCA 


GGA TAT 


CCT 


TGG 


CCT 


CTT 


TAC 


GGA 


AAC 


GAG 


GGC 


TGC 


273 


GGT 


TGG 


GCA 


GGT 


TGG 


CTC 


CTG 


TCC 


CCC 


CGC 


GGG 


TCT 


CGT 


312 


CCT 


ACT 


TGG 


GGC 


CCC 


ACT 


GAC 


CCC 


CGG 


CAT 


AGA 


TCA 


CGT 


351 


AAT 


TTG 


GGC 


AGA 


GTC 


ATC 


GAT 


ACC 


ATT 


ACA 


TGT 


GGT 


TTT 


390 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCT 


GTC 


GTT 


GGC 


GCC 


CCG 


429 


GTC 


GGA 


GGC 


GTC 


GCC 


AGA 


GCT 


CTG 


GCA 


CAT 


GGT 


GTT 


AGG 


468 


GTC 


CTG 


GAA


GAC 


GGG 


ATA 


AAC 


TAT 


GCA 


ACA 


GGG 


AAT 


TTG 


507 


CCT 


GGT 


TGC 


TCT 


TTT 


TCT 


ATC 


TTC 


TTG 


CTT 


GCT 


CTT 


CTG 


546 


TCA 


TGC 


TTC 


ACA 


GTG 


CCA 


GTG 


TCT 


GCA



(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:


ATG 


AGC 


ACA 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


AGA 


AAC 


ACA 


AAC 


CGC 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGT 


GGC 


GGT 


CAG 


ATC 


GTT 


GGC 


GGA 


GTT 


TAC 


TTG 


CTG 


CCG 


CGC 


AGG 


GGC 


CCC 


AGG 


TTG 


GGT 


GTG 


CGC 


GCG 


ACA 


AGG 


AAG 


ACT 


TCC 


GAG 


CGA 


TCC 


CAG 


CCG 


CGT 


GGG 


AGA 


CGC 


CAG 


CCC 


ATC 


CCG 


AAA 


GAT 


CGG 


CGC 


TCC 


ACC 


GGC 


AAG 


TCC 


TGG 


GGA 


AAG 


CCA 


GGA 


TAT 


CCT 


TGG 


CCT 


CTG 


TAC 


GGA 


AAC 


GAG 


GGC 


TGC 


GGC 


TGG 


GCA 


GGT 


TGG 


CTC 


CTG 


TCC 


CCC 


CGC 


GGG 


TCT 


CGT 


CCT 


ACT 


TGG 


GGC 


CCC 


ACT 


GAC 


CCC 


CGG 


CAC 


AGA 


TCA 


CGT 


AAC 


TTG 


GGC 


AAG 


GTC 


ATC 


GAT 


ACC 


ATT 


ACG 


TGT 


GGT 


TTT 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATC 


CCT 


GTC 


GTT 


GGC 


GCC 


CCG 


GTC 


GGA 


GGC 


GTC 


GCC 


AGA 


GCT 


CTG 


GCA 


CAC 


GGT 


GTT 


AGG 


GTC 


CTG 


GAA 


GAC 


GGG 


ATA 


AAT 


TAC 


GCA 


ACA 


GGG 


AAT 


CTG 


CCT 


GGT 


TGC 


TCC 


TTT 


TCT 


ATC 


TTC 


TTA 


CTT 


GCT 


CTT 


CTG 


TCG 


TGC 


GCC 


ACG 


GTG 


CCG 


GTG 


TCT 


GCA 











(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK11






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 



ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAT ACA AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGT 78
GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC ACG ACA AGG AAG ACT 156
TCC GAG CGA TCC CAG CCG CGT GGG AGA CGC CAG CCC ATC 195
CCG AAA GAT CGG CGC TCC ACC GGC AAG CCC TGG GGA AAG 234
CCA GGA TAT CCT TGG CCC CTG TAT GGA AAC GAG GGC TGC 273
GGC TGG GCA GGT TGG CTC CTG TCC CCC CGC GGG TCT CAT 312
CCT AAT TGG GGC CCC ACT GAC CCC CGG CAT AAA TCA CGC 351
AAT TTG GGT AAA GTC ATC GAC ACC ATT ACG TGT GGT TTT 390
GCC GAG CTC ATG GGG TAC ATC CCT GTC GTC GGC GCC CCG 429
GTC GGA GGC GTC GCC AGA GCT CTG GCA CAC GGT GTT AGA 468
GTC CTG GAA GAC GGG ATA AAT TAC GCA ACA GGG AAT CTG 507
CCT GGT TGC TCT TTT TCT ATC TTC TTA CTT GCT CTT CTG 546
TCA TGC TGC ACA GTG CCA GTG TCT GCG 573






(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:



ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAT ACA AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGT 78
GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACA AGG AAG ACT 156
TCC GAG CGA TCC CAG CCG CGT GGG AGA CGC CAG CCC ATC 195
CCG AAA GAT CGG CGC TCC ACC GGC AAG TCC TGG GGA AAG 234
CCA GGA TAT CCT TGG CCC CTG TAT GGA AAC GAG GGC TGC 273
GGC TGG GCA GGT TGG CTC CTG TCC CCC CGC GGG TCT CAT 312
CCT AAT TGG GGC CCC ACT GAC CCC CGG CAT AGA TCA CGC 351
AAT TTG GGC AAA GTC ATC GAC ACC ATT ACG TGT GGT TTT 390
GCC GAC CTC ATG GGG TAC ATC CCT GTC GTT GGC GCC CCG 429
GTC GGA GGC GTC GCC AGA GCT CTG GCA CAC GGT GTT AGA 468
GTC CTG GAA GAC GGG ATA AAT TAC GCA ACA GGG AAT CTG 507
CCT GGT TGC TCT TTT TCT ATC TTC TTA CTT GCT CTT CTG 546
TCG TGC TTC ACA GTG CCA GTG TCT GCG 573

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACA AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGT 78
GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACA AGG AAG TCT 156
TCC GAG CGA TCC CAG CCG CGT GGG AGG CGC CAG CCC ATC 195
CCG AAA GAT CGG CGC TCC ACC GGC AAG TCC TGG GGA AAA 234
CCG GGA TAT CCT TGG CCC CTG TAT GGA AAC GAG GGC TGC 273
GGC TGG GCA GGT TGG CTC CTG TCC CCC CGC GGG TCT CGT 312
CCT ACT TGG GGC CCC ACT GAC CCC CGG CAT AGA TCA CGC 351
AAT TTG GGC AAA GTC ATC GAC ACC ATT ACG TGT GGT TTT 390
GCC GAG CTC ATG GGG TAC ATC CCT GTC GTT GGC GCC CCG 429
GTT GGA GGC GTC GCC AGA GCT CTG GCA CAC GGT GTT AGG 468
GTC CTG GAA GAC GGG ATA AAT TAC GCA ACA GGG AAT TTG 507
CCT GGT TGC TCT TTT TCT ATC TTC TTG CTT GCT CTT CTG 546
TCG TGC TGC ACA GTG CCA GTG TCT GCG 573

(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 



ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACT AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGC CAG ATC GTT GGC GGA GTA TAC TTG CTG CCG CGC 117
AGG GGC CCG AGA TTG GGT GTG CGC GCG ACG AGG AAA ACT 156
TCC GAA CGG TCC CAG CCA CGT GGG AGG CGC CAG CCC ATC 195
CCT AAA GAT CGG CGC ACC ACT GGC AAG TCC TGG GGA AGG 234
CCA GGA TAC CCT TGG CCC CTG TAT GGG AAT GAG GGC CTC 273
GGC TGG GCA GGG TGG CTC CTG TCC CCC CGC GGT TCT CGC 312
CCT TCA TGG GGC CCC ACC GAC CCC CGG CAT AAA TCG CGC 351
AAC TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGT TTT 390
GCC GAC CTC ATG GGG TAC ATA CCC GTC GTT GGC GCT CCC 429
GTT GGC GGC GTT GCC AGA GCC CTC GCC CAT GGG GTG AGG 468
GTT CTG GAG GAC GGG ATA AAT TAT GCA ACG GGG AAT TTG 507
CCC GGT TGC TCT TTC TCT ATC TTT CTC TTG GCC CTC TTG 546
TCT TGC ATC TCT GTG CCA GTT TCC GCC 573

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 






(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK10





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACC ATC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGT 78
GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117
AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156
TCT GAA CGG TCG CAG CCT CGC GGA CGA CGA CAG CCT ATC 195
CCC AAG GCG CGT CGG AGC GAA GGC CGG TCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGT AAC GAG GGC TGC 273
GGG TGG GCA GGA TGG CTC CTG TCC CCA CGC GGC TCC CGT 312
CCA TCT TGG GGC CCA AAC GAC CCC CGG CGA CGG TCC CGC 351
AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGA TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCC 429
GTA GGA GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468
GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAC TTG 507
CCC GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC 546
TCT TGC TTA ATT CAT CCA GCA GCT AGT 573

(2) INFORMATION FOR SEQ ID NO: 136:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S52 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACC ATC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGT 78
GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117
AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156
TCT GAA CGG TCA CAG CCT CGC GGA CGA CGA CAG CCT ATC 195
CCC AAG GCG CGT CGG AGC GAA GGC CGG TCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGT AAT GAG GGC TGC 273
GGG TGG GCA GGG TGG CTC CTG TCC CCA CGC GGC TCC CGT 312
CCA TCT TGG GGC CCA AAC GAC CCC CGG CGG AGG TCC CGC 351
AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGA TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCC 429
GTA GGA GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468
GCC CTT GAA GAC GGG ATA AAT TTT GCA ACA GGG AAC TTG 507
CCC GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC 546
TCC TGC TTA GTT CAT CCT GCA GCT AGT 573






(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACC ATC CGT CGC CCA CAG GAC ATC AAG TTC CCG GGT 78
GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117
AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156
TCT GAA CGG TCA CAG CCT CGC GGA CGG CGA CAG CCT ATC 195
CCC AAG GCG CGT CGG AGC GAA GGC CGA TCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGT AAC GAG GGC TGC 273
GGG TGG GCA GGG TGG CTC CTG TCC CCA CGC GGC TCC CGT 312
CCA TCT TGG GGC CCA AAT GAC CCC CGG CGG AGG TCC CGC 351
AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCC 429
GTA GGA GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468
GCC CTT GAA GAC GGG ATA AAT TTT GCA ACA GGG AAC TTG 507
CCC GGT TGC TCT TTT TCT ATC TTC CTT CTT GCC CTG TTC 546
TCT TGC TTA ATT CAT CCA GCA GCT AGT 573

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK12 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACC ATC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117
AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156
TCT GAA CGG TCA CAG CCT CGC GGA CGG CGA CAG CCT ATC 195
CCC AAG GCG CGT CGG AGC GAA GGC CGG TCC TGG GCT CAG 234
CCT GGG TAC CCT TGG CCC CTC TAT GGT AAC GAG GGC TGC 273
GGG TGG GCA GGG TGG CTC CTG TCC CCA CGC GGC TCC CGT 312
CCA TCT TGG GGC CCA AAC GAC CCC CGG CGG AGG TCC CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGA TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCT 429
GTA GGG GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468
GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAC TTG 507
CCC GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC 546
TCT TGC CTA ATT CAT CCA GCA GCT AGT 573

(2) INFORMATION FOR SEQ ID NO: 139:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCC ATG GAC GTA AAG TTC CCG GGT 78
GGT GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGA AAG ACT 156
TCG GAG CGG TCG CAA CCT CGT GGC AGG CGT CAA CCT ATC 195
CCC AAG GCG CGC CAG CCA GAG GGC AGA TCC TGG GCG CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGG TGG GCA GGG TGG CTC CTG TCT CCT CGC GGC TCT CGG 312
CCA TCT TGG GGC CCA AAT GAT CCC CGG CGG AGA TCG CGC 351
AAT CTG GGT AAG GTC ATC GAT ACC CTG ACG TGC GGC TTC 390
GCC GAC CTC ATG GGA TAC ATC CCG ATC GTG GGC GCC CCC 429
GTG GGG GGC GTC GCC AGG GCT CTG GCG CAT GGC GTC AGG 468
GCT GTG GAG GAC GGG ATT AAC TAT GCA ACA GGG AAT CTT 507
CCC GGT TGC TCT TTC TCT ATC TTC CTT TTiS GCA CTT CTT 546
TCG TGC CTC ACT GTT CCA GCG TCG GCT 573

(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z8




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCT ATG GAT GTA AAA TTC CCA GGC 78
GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156
TCG GAG CGG TCG CAA CCT CGT GGC AGG CGT CAG CCT ATC 195
CCC AAG GCA CGT CGG TCC GAG GGT AGG TCC TGG GCT CAG 234
CCC GGG TAC CCA TGG CCT CTT TAC GGT AAT GAA GGC TGT 273
GGG TGG GCA GGT TGG CTC CTG TCC CCC CGC GGC TCT CGA 312
CCG TCT TGG GGC CCA AAT GAT CCC CGG CGG AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC CCA 429
GTA GGA GGC GTC GCC AGA GCC CTG GCG CAT GGC GTC AGG 468
GCT GTG GAG GAC GGG ATC AAC TAT GCA ACA GGG AAC CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTC TTG GCA CTT CTC 546
TCG TGC CTA ACC GTC CCA GCG TCT GCT 573

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCC ATG GAT GTG AAA TTC CCG GGC 78
GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117
AGG GGC CCC CGG TTG GGT GTG CGC GCA GCT CGG AAG ACT 156
TCG GAG CGG TCA CAA CCT CGT GGC AGG CGT CAG CCT ATC 195
CCC AAG GCG CGC CGG TCC GAG GGC AGG TCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTT TAC GGC AAT GAG GGC TGT 273
GGG TGG GCA GGG TGG CTC CTG TCC CCC CGC GGT TCC AGG 312
CCG TCT TGG GGC CCC AAT GAT CCC CGG CGT AGG TCC CGT 351
AAT CTG GGT AAA GTC ATC GAT ACC CTG ACG TGT GGC TTC 390
GCC GAC CTC ATG GGA TAC ATT CCG CTC GTA GGC GCC CCT 429
GTG GGT GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG 468
GCC GTG GAG GAC GGA ATT AAC TAC GCA ACA GGG AAC CTT 507
CCT GGT TGC TCT TTC TCT ATC TTT CTT CTT GCA CTT CTC 546
TCG TGC CTG ACA ACA CCA GCA TCT GCC 573

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 



(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z5



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCC ATG GAT GTA AAA TTC CCG GGT 78
GGT GGT CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156
TCG GAG CGG TCG CAA CCT CGC GGC AGG CGT CAG CCT ATC 195
CCC CAG GCA CGT CGG TCC GAG GGC AGG TCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCT CTT TAT GGC AAT GAG GGC TGT 273
GGG TGG GCA GGG TGG CTC CTG TCC CCC CGC GGA TCT CGG 312
CCA TCT TGG GGC CAA AAT GAT CCC CGG CGT AGG TCC CGC 351
AAT CTG GGT AAG GTC ATC GAT ACC CTG ACG TGT GGC TTC 390
GCC GAC CTC ATG GGA TAC ATT CCG CTC GTC GGC GCC CCA 429
GTA GGT GGC GTC GCC AGG GCC TTG GCG CAT GGC GTC AGG 468
GCC CTG GAG GAC GGA ATC AAC TAT GCA ACA GGG AAT CTT 507
CCT GGT TGC TCC TTT TCT ATC TTC CTA CTT GCA CTT TTC 546
TCG TGC TTG ACA ACA CCG GCA TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCC ATG GAC GTT AAG TTC CCG GGT 78
GGT GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCG GAG CGG TCG CAA CCT CGT GGG AGA CGC CAG CCT ATC 195
CCC AAG GCA CGT CGA TCT GAG GGA AGG TCC TGG GCT CAG 234
CCC GGG TAT CCA TGG CCT CTT TAC GGT AAT GAG GGT TGC 273
GGG TGG GCG GGA TGG CTC CTG TCA CCC CGT GGC TCT CGA 312
CCG TCT TGG GGT CCA AAT GAT CCC CGG CGA AGG TCC CGC 351
AAC TTG GGT AAG GTC ATC GAT ACT CTA ACT TGC GGT TTC 390
GCC GAT CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC CCC 429
GTG GGC GGC GTC GCC AGG GCC CTG GCA CAT GGT GTT AGG 468
GCT GTG GAG GAC GGG ATC AAT TAT GCA ACA GGG AAT CTT 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCA CTT CTT 546
TCG TGC CTA ACT GTT CCC ACC TCG GCC 573



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCC ATG GAC GTT AAG TTC CCG GGC 78
GGT GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCC AGA TTG GGT GTG CGC ACA ACT AGG AAG ACT 156
TCG GAG CGG TCG CAA CCT CGT GGG AGA CGT CAG CCT ATC 195
CCC AAG GCA CGT CGA TCT GAG GGA AGG TCC TGG GCT CAA 234
CCC GGG TAC CCA TGG CCT CTT TAC GGT AAC GAG GGT TGC 273
GGG TGG GCA GGA TGG CTC TTG TCA CCC CGT GGC TCT CGA 312
CCG TCT TGG GGC CCA AAT GAT CCC CGG CGA AGG TCC CGC 351
AAC TTG GGT AAG GTC ATC GAT ACC CTA ACC TGC GGC TTT 390
GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC CCC 429
GTG GGC GGC GTC GCC AGG GCC CTA GCG CAT GGC GTT AGG 468
GCT CTG GAG GAC GGG AT^ AAT TAT GCA ACA GGG AAC CTT 507
CCC GGT TGC TCT TTT TCT ATC TTC CTC TTG GCA CTT CTT 546
TCG TGC CTG ACT GTT CCC GCC TCG GCC 573



(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA ATG GAC GTT AAG TTC CCG GGT 78
GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCG GAG CGG TCG CAA CCT CGT GGG AGG CGC CAG CCT ATC 195
CCC AAG GCG CGC CAA CTC GAG GGT AGG TCC TGG GCT CAG 234
CCT GGG TAT CCT TGG CCC CTT TAC GGC AAT GAG GGC TGC 273
GGG TGG GCG GGA TGG CTC CTG TCA CCC CGT GGC TCT CGG 312
CCG TCT TGG GGC CCG AAT GAT CCC CGG CGG AGG TCC CGC 351
AAC TTG GGT AAG GTC ATC GAT ACC CTA ACT TGC GGC TTC 390
GCC GAC CTC ATG GGA TAC ATC CCG GTC GTA GGC GCC CCC 429
GTG GGT GGC GTC GCC AGA GCC CTG GCG CAT GGC GTC AGG 468
CTT CTG GAG GAC GGG GTC AAT TAT GCA ACA GGG AAT CTT 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCA CTG CTC 546
TCG TGC CTG ACT GTT CCC GCT TCG GCC 573



(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACC AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTC TAC TTG TTG CCG CGC 117
AGG GGC CCT AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156
TCA GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195
CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234
CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273
GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312
CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429
GTT GGG GGC GTC GCA AGG GCC CTT GGA CAT GGT GTG AGG 468
GTT CTT GAG GAC GGG GTA AAC TAT GCA ACG GGG AAT TTG 507
CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTC 546
TCG TGC CTG ACC GTC CCG GCC TCT GCA 573




(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACT CGG AAG ACT 156
TCA GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195
CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234
CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273
GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312
CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAA TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429
GTT GGG GGC GTC GCA AGG GCC CTC GCA CAT GGT GTG AGG 468
GTT CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTG 507
CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTC 546
TCG TGC TTG ACC GTC CCA GCC TCT GCA 573




(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCA GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195

CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAG TCG CGC 351

AAT TTG GGT AAG GTC ATC GAC ACC CTA ACA TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAT TAC GCA ACA GGG AAT CTG 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTC 546

TCG TGC CTG ACC GTC CCA GCC TCC GCA 573



(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CTC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCG GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195

CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGG AAG TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAC GCA ACA GGG AAT TTG 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTT 546

TCC TGT CTG ATC ATC CCG GCC TCT GCA 573



(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 


ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39 

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78 

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117 

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156 

TCA GAA CGG TCG CAA CCC CGT GGA CGG CGC CAG CCT ATT 195 

CCC AAG GCT CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA 234 

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273 

GAG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312 



CCT AGT TGG GGC CCC AAC GAC CCC CGG CGG AAA TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAT CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAT GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAC GCA ACA GGG AAT TTA 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTT 546

TCA TGC CTG ACC GTC CCG GCC TCT GCA 573






(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA13 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT AGG TTG GGT GTG CGC GCA ACT CGG AAG ACT 156

TCA GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG CCT ATC 195

CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAT GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGG AAA TCG CGC 351

AAC TTG GGT AAG GTC ATC GAT ACC CTG ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTC CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTT 546

TCA TGC CTG ACT GTC CCG ACC TCT GCC 573











(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC CAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT CGT ATG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG CCT ATT 195

CCC AAG GCG CGC CAA TCC GCG GGT CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAA TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTG 507

CCC GGT TGC TCT TTC TCT ATC TTT GTC CTT GCA CTT CTC 546

TCG TGC CTA ACC GTC CCT GCC TCT GCA 573



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCA GAA CGG TCG CAA CCC CGT GGG CGG CGT CAG CCT ATT 195

CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC TTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG CTG CTC TCC CCT CGA GGC TCT CGG 312

CCT AAC TGG GGC CCC AAT GAC CCC CGG CGA AGA TCG CGC 351

AAT TTG GGC AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCC CTC GCA CAC GGT GTG AGA 468

GCT CTT GAG GAC GGG GTA AAT TAT GCA ACA GGG AAT CTT 507

CCC GGT TGC TCT TTC TCC ATC TTT ATC CTT GCA CTT CTC 546

TCG TGC TTG ACC GTC CCG GCC ACT GCA 573



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 



ATG AGC ACA CTT CCA AAA CCC CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGT CGC CCA ACG GAC GTC AAG TTC CCG GGT 78

GGC GGT CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCC CGG TTG GGT GTG CGC GCG ACG AGA AAG ACT 156

TCC GAG CGA TCC CAG CCC AGA GGC AGG CGC CAA CCT ATA 195

CCA AAG GCG CGC CAG CCC CAG GGC AGG CAC TGG GCT CAG 234

CCC GGA TAC CCT TGG CCT CTT TAT GGA AAC GAG GGC TGT 273

GGG TGG GCA GGT TGG CTC CTG TCC CCC CGC GGC TCC CGG 312

CCA CAT TGG GGC CCC AAT GAC CCC CGG CGT CGA TCC CGG 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGT GGG TTC 390

GCC GAT CTC ATG GGG TAC ATT CCC GTC GTG GGC GCG CCT 429

TTG GGC GGC GTC GCG GCT GCG CTC GCA CAT GGC GTG AGG 468

GCA ATC GAG GAC GGG ATC AAT TAT GCA ACA GGG AAT CTC 507

CCC GGT TGC TCT TTC TCT ATC TTC CTT TTG GCA CTA CTC 546

TCG TGC CTC ACA ACG CCA GCT TCG GCT 573



(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Pro Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US11



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



10 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

( D ) TOPOLOGY : unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

( D ) TOPOLOGY : unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:




Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S18 







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:


Met 


Ser 


Thr 


Asn 


Pro 


Lys 


* 1. ^ 


Gin 


AJT^ 




Th"r 




A T"fT 

Arg 


Asn 


1 








5 










10 










Thr 


Asn 


Arg 


Arg 


Xr i W 






vox 


T .vra 
JLjjr St 




ir X. \j 


vjx y 


vaxy 


Gly 


15 










20 










25 








Gin 


He 


Val 


Gly 


Jr 


V 






T .oil 




Arg 


Arg 


\jxy 


Pro 




30 










35 
















Arg 


Leu 


Gly Val 


Arg 


Ala


Thr


Arg 


T .wa 
Xiy o 


X 111. 


oex 


vjXU 


Arg 


Ser 






45 




















DO 




Gin 


Pro 


Arg Gly 


- 

Arg 


Arg 


m n 


Pro 


Ile


Pro


Lys


Ala


Arg 


Arg 








60 










o o 










70 


Pro 


Glu 


Gly Arg 


Thr


Trp


Ala


V7X 11 


Pro 


*jxy 


Tyr 


Pro


Trp 


Pro 










75 










80 










Leu 


Tyr 


Gly Asn 


Glu 


Gly 


Cys 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


85 










90 










95 








Ser 


Pro 


Arg Gly 


Ser 


Arg 


Pro 


Ser 


Trp 


Gly 


Pro 


Thr 


Asp 


Pro 




100 










105 










110 






Arg 


Arg 


Arg 


Ser 


Arg 


Asn 


Leu 


Gly 


Lys 


Val 


He 


Asp 


Thr 


Leu 






115 










120 










125 




Thr 


Cys 


Gly 


Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 








130 










135 










140 


Gly 


Ala 


Pro 


Leu 


Gly 


Gly 


Ala 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 










145 










150 










Val 


Arg 


Val 


Leu 


Glu 


Asp 


Gly 


Val 


Asn 


Tyr 


Ala


Thr 


Gly 


Asn 


155 










160 










165 








Leu 


Pro 


Gly 


Cys 


Ser 


Phe 


Ser 


He 


Phe 


Leu 


Leu 


Ala 


Leu 


Leu 




170 








175 










180 






Ser 


Cys 


Leu 


Thr 


Val 


Pro 


Ala 


Ser 


Ala 














185 










190 















(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

( D ) TOPOLOGY : unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Pro Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190











(2) INFORMATION FOR SEQ ID NO: 162: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid 

( C ) STRANDEDNESS : unknown 

( D ) TOPOLOGY : unknown 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S45

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:



Met Ser Thr Asn Pro Lys Pro Gln Arg Ala Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly His Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

( D ) TOPOLOGY : unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190

(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

( C ) STRANDEDNESS : unknown 

(D) TOPOLOGY : unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190







(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190

(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10






Thr 


Asn 


Arg Arg 


Pro 


Gin Asp Val 


Lys 


Phe Pro 


Gly 


Gly Gly 


15 








20 




25 






Gin 


He 


Val Gly 


Gly Val Tyr Leu 


Leu 


Pro Arg 


Arg 


Gly






30 






35 




40 




Arg 


Leu 


Gly Val 


Arg 


Ala Thr Arg 


Lys 


Thr Ser 


Glu 










45 




50 












Gin 


Pro 


Arg Gly 


Arg 


Arg Gin Pro 


He 


Pro Lys 


Ala 




Arg 






60 






65 






1 u 


Pro 


Glu 


Gly Arg 


Ala 


Trp Ala Gin 


Pro 


Gly Tyr 


Pro 


Trp 


Pro 








75 






80 






Leu 




Gly Asn 


Glu 


Gly Met Gly 


Trp 


Ala Gly 


Trp


Leu 


Leu 


85 








90 




95 






Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190











(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:


Met 


Ser 


Thr 


1 






Thr 


Asn 


Arg 


15 






Gin 


lie 


Val 




30 




Arg 


Leu 


Gly 






45 


Gin 


Pro 


Arg 


Pro 


Glu 


Gly 


Leu 


Tyr 


Gly 


85 






Ser 


Pro 


Arg 




100 




Arg 


Arg 


Arg 






115 


Thr 


Cys 


Gly 



60 



Pro 


Lys 


Pro 


Gin 


Arg 


Lys 


Thr 


Lys 


Arg 


Asn 


5 








10 










Pro 


Gin 


Asp 


Val 


Lys 


Phe 


Pro 


Gly 


Gly 


Gly 




20 








25 








Gly 


Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 


Gly 


Pro 




35 










40 






Arg 


Ala 


Thr 


Arg 


Lys 


Thr 


Ser 


Glu 


Arg 


Ser 






50 










55 




Arg 


Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 


Gin 






65 










70 


Ala 


Trp 


Ala 


Gin 


Pro 


Gly 


Tyr 


Pro 


Trp 


Pro 


75 








80 










Glu 


Gly 
90 


Met 


Gly 


Trp 


Ala 


Gly 
95 


Trp 


Leu 


Leu 


Ser 


Arg 


Pro 


Ser 


Trp 


Gly 


Pro 


Thr 


Asp 


Pro 




105 










110 






Arg 


Asn 


Leu 


Gly 


Lys 


Val 


He 


Asp 


Thr 


Leu 






120 










125 




Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 



35 



130 135 140 



Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly His Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg His
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190











(2) INFORMATION FOR SEQ ID NO: 172:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
60 65 70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Thr Pro Ala Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro His Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Ile Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Thr Pro Val Ser Ala
185 190











(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
60 65 70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Val Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

Met Ser Thr Thr Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ser Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly His Pro Trp Pro
75 80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Gly Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Val Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asp Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Ser Leu Ala Asp Leu Met Gly Tyr Val Pro Val Val
130 135 140

Gly Gly Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ile Thr Ile Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25
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10 



Gin 


lie 


Val 


Glv 




30 










Glv 


Val 






45 




Gin 




Arcr 


Glv 








60 


Pro 


Thr 


Gly 


Lys 


L6U 


Tvr 


Glv 


Asn 


85 








OCX. 


Ptt* 

IT 1, \J 




Glv 




100 






Arg 


His 


Arg 


Ser 






115 




Thr 


Cys 


Gly 


Phe 








130 


Gly 


Ala 


Pro 


Leu 


Val 


Arg 


Val 


Leu 


155 








Leu 


Pro 


Gly 


Cys 




170 






Ser 


Cys 


He 


Thr 



185 
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Glv 


Val 


Tvr 
35 


Leu 


Leu 


ArcT 


Ala 


Thr 


ArCT 
50 


Lys 


Ara 


Ara 


Gin 


Pro 


He 
65 


Ser 




Glv 

ox/ 


J-Jjr 0 


Pro 

X w 


75 










Glu 


Glv 
90 


Leu 


Glv 


Tro 

X xp 






1 n^ 


OCX 


irp 


Arg 


Asn 


Val 


Gly 
120 


Lys 


Ala 


Asp 


Leu 


Met 


Gly 
135 


Gly 


Gly 


Val 


Ala 


Arg 


145 










Glu 


Asp 
160 


Gly 


Val 


Asn 


Ser 


Phe 


Ser 
175 


He 


Phe 


He 


Pro 


Val 


Ser 


Ala 



190 



Pro Arcr 




Glv 
w»xy 


Xr X \J 




40 






X liX wCX 


Gin 
ox u 




OCX 










Pyt> Tiva 

£^ X w Ai jr O 














70 




XrX 


irp 


ArxO 


80 








Ala Glv 


1 


AJCU 


ucu 










Gly Pro 


Thr 


Asp 


Pro 




.110 






Val He 


Asp 


Thr 


Leu 




125 




Tyr He 


Pro 


Val 


Val 








140 


Ala Leu 


Ala 


His 


Gly 


150 






Tyr Ala 


Thr 


Gly 


Asn 


165 








Leu Leu 


Ala 


Leu 


Leu 



180 



(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Ile Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Thr Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Ser Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ile Thr Thr Pro Ala Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

Met Ser Thr Ile Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ile Thr Ile Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Thr Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Arg Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Phe Thr Val Pro Val Ser Ala
185 190
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(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Thr Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ala Thr Val Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Thr Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Pro Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser His Pro Asn Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Cys Thr Val Pro Val Ser Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser His Pro Asn Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Phe Thr Val Pro Val Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK8



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Ser Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Thr Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Cys Thr Val Pro Val Ser Ala
185 190











(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S83 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Thr Thr Gly Lys Ser Trp Gly Arg Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ser Val Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 



Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Val Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Ile His Pro Ala Ala Ser
185 190



(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S52



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 



Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Val Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Val His Pro Ala Ala Ser
185 190











(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S2



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Ile Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Val Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Ile His Pro Ala Ala Ser
185 190













(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:



Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Val Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Ile His Pro Ala Ala Ser
185 190













(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
60 65 70

Pro Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Ile Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Ala Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Thr Pro Ala Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 194:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Gln Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Gln Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Thr Thr Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:


Met 


Ser 


Thr 


Asn Pro 


Lvs 


Pro 


Gin 


Aro 


Lys 


Thr 


Lys 


At*ct 




1 






















Thr 


Asn 


Arg 


Arg Pro 


Met 


Asp 


Val 


Lys 


Phe 


Pro 


Glv 


Glv 


Gly 


15 








20 












Gin 


lie 


Val 


Gly Gly 


Val 


Tvr 


Leu 


Leu 


Pro 


At* CI 




Glv 


Pro 




30 








•a c 
J -J 










H u 






Arg 


Leu 


Gly 


Val Arg 


Ala 


Thr 


Arg 


Lys 


Thr 


Ser 


Gill 

V,7X U 


A T"fT 


o cx 






45 


















Do 




Gin 


Pro 


Arg 


Gly Arg 




Gin 


Pro 


Tie 




u jr 0 


AT a 


A irn 

Arg 


Arg 








D W 








fie? 










/ U 


Ser 


Glu 


Gly 


Arg Ser 


irp 






Pro 


t»iy 


Tyr 


Pro 


Trp 


Pro 








/ O 










on 








Leu 


Tyr 


Gly 




C 1 IT 

(jxy 


Cys 




Trp 


an 3 
AJ.a 


t*xy 


Trp 


Leu 


Leu 


85 


























Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Thr Ser Ala
185 190















(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Thr Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190











(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
60 65 70

Leu Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Leu Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190















(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
60 65 70

Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:


Met 


Ser Thr 


no 11 c^^sj 


T*VQ 




m n 

V7X11 


Arg 


Lys Thr 


Lys 


Arg 


Asn 


1 




c 










10 




Thr 


Asn Arg 




m n 




Velx 


Lys 


Phe Pro 


Gly 




Gly 


15 














25 




Gin 


He Val 


vaXy \s±y 


vax 


iyr 


Leu 


Leu 


Pro Arg Arg 


Gly 


Pro 




30 






"a c: 
J D 








40 




Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Leu Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Ile Ile Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Ala Asn Glu Gly Leu Glu Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190













(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Thr Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Gln Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Met Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Ser Ala Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Val Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Phe Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Ala Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Thr Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 206:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:


Met 


Ser Thr Leu 




Jjys 






Arg 


T.vra 
JjyD 






nX^^ iisn 


1 




5 










10 








Thr 


Asn Arg Arg 


Pro 


Tnr 


Asp 


vai 


Lys 


Fne 


Pro 


v»jLy 


vjiy Gly 


15 






20 










25 






Gin 


He Val Gly 


Gxy 


vai 


Tyr 


Leu 


jjeu 


Pro 


Arg 


Arg 


Giy pro 




30 






1 c 
JD 










40 




Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Gln Gly Arg His Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro His Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Val Ala Ala Ala Leu Ala His Gly
                145                 150

Val Arg Ala Ile Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Thr Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:207:

GCGTCCGGGT TCTGGAAGAC GGCGTGAACT ATGCAACAGG

(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:208:

AGGCTTTCAT TGCAGTTCAA GGCCGTGCTA TTGATGTGCC

(2) INFORMATION FOR SEQ ID NO:209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:209: 

AAGACGGCGT GAACTATGCA ACAGGGAACC TTCCTGGTTG 

(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:

AGTTCAAGGC CGTGCTATTG ATGTGCCAAC TGCCGTTGGT 

(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:211:

AAGACGGCGT GAATTCTGCA ACAGGGAACC TTCCTGGTTG

(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:212:

AGTTCAAGGC CGTGGAATTC ATGTGCCAAC TGCCGTTGGT

(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:213: 

ARCTYCGACG TYACATCGAY CTGCTYGTYG GRAGYGCCAC CC 



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214:

RCARGCCRTC TTGGAYATGA TCGCTGGWGC Y

(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

CRATACGACR YCAYGTCGAY TTGCTCGTTG GGGCGGCTRY YT

(2) INFORMATION FOR SEQ ID NO: 216:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

RCAAGCTRTC RTGGAYRTGG TRRCRGGRGC C 

(2) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

TTGCGGACKC ACATYGACAT GGTYGTGATG TCCGCCACGC 

(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:218: 

GATGCGCGTT CCCGAGGTCA TCWTAGACAT CRTYRGCGGR GCD 

(2) INFORMATION FOR SEQ ID NO: 219: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:

AATGGCACCY TGCRCTGCTG GATACAAGTR ACACCTAATG TGGCTGTGAA
ACAC

(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

TGARCTAGYC CTYSARGTYG TCTTCGGYGG Y 

(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

GCCAACGTCT CTCGATGTTG GGTGCCGGTT GCCCCCAATC TCGCCATAAG
TCAA

(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:222:

AAGGGCCTGC GAGCACACAT CGATATCATC GTGATGTCTG CTACGG

(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

TTGGTGCGCA TCCCGGAAGT CATCTTGGAT ATTGTTACAG GAGGT

(2) INFORMATION FOR SEQ ID NO: 224:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

AGTCAGGTAY GTCGGAGCAA CCACCGCYTC GATACGCAGT

(2) INFORMATION FOR SEQ ID NO: 225:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:225: 

AGCCTTCACG TTCAGACCKC GTCGCCATCA AACRGTCCAG ACCTGT 



(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226:

TCCCCCGCYG TGGGTATGGT GGTRGCGCAC RTYCTGCGDY TGCCCCAGAC
CKTGTTYGAC ATAMTRGCYG GGGCC



(2) INFORMATION FOR SEQ ID NO: 227: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:

ACGCCGGTGA CGCCTACAGT GGCTGTCGCA CACCCGGGC

(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 

ATGAGGGTCC CCACAGCCTT TCTCGACATG GTTGCCGGAG GC 



(2) INFORMATION FOR SEQ ID NO: 229: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 

CGCGCCCTAT CCCAACGCAC CGTTAGAGTC CATGCGCAGG 



(2) INFORMATION FOR SEQ ID NO: 230: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230: 

TCAGATCTTA CGGATCCCCT CTATCCTAGG TGACTTGCTC ACCGGGGGT 



(2) INFORMATION FOR SEQ ID NO: 231:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:

CAGTCACGCT GCTGGGTGGC CCTTACTCCC ACCGTGGCGG YGYCTTATAT
CGGT

(2) INFORMATION FOR SEQ ID NO: 232:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:232:

TAGCACTCTG GTRGAYCTAC TCRCTGGAGG G

(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:233:

AAGTCTACAT GCTGGGTGTC TCTCACCCCC ACCGTGGCTG CGCAACATCT
GAAT

(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:234:

AGGCGCCATG GTCGACCTGC TTGCAGGCGG C

(2) INFORMATION FOR SEQ ID NO:235: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:235:

TCAGCCCCGA VYYTCGGAGC GGTCACGGCT CCTCTTCGGA GGG

(2) INFORMATION FOR SEQ ID NO: 236:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:236: 

TGYTACGGAT YCCCCARGTG GTCATHGACA TCATWGCCGG GGSC 44 

(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:

CATACCAAAT GCTTCCACGC CCGCAACGGG ATTCCGCAGG 40

(2) INFORMATION FOR SEQ ID NO: 238:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:238:

TCTTCTTGCG GGCGCCGCAG TGGTTTGCTC ATCCCTG 37

(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239: 

ATCTAGCATC TTGAGGGTAC CTGAGATTTG TGCGAGTGTG ATATTTGGTG 50 
GC 52 



(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:240:

Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala
                5                   10                  15

Leu Thr His Asn Leu Arg Xaa His Xaa Asp Xaa Ile Val Met Ala
                20                  25                  30

Ala Thr Val

(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:

Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro Gly Ala
                5                   10                  15

Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val Met Ser
                20                  25                  30

Ala Thr Val

(2) INFORMATION FOR SEQ ID NO: 242:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:242:

Trp Ile Pro Val Xaa Pro Asn Val Ala Val Xaa Xaa Pro Gly Ala
                5                   10                  15

Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser
                20                  25                  30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 243:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243:

Trp Thr Xaa Val Thr Pro Thr Val Ala Val Arg Tyr Val Gly Ala
                5                   10                  15

Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly Ala
                20                  25                  30

Ala Thr Xaa

(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

Trp Val Ala Leu Xaa Pro Thr Leu Ala Ala Arg Asn Xaa Xaa Xaa
                5                   10                  15

Xaa Thr Xaa Xaa Ile Arg Xaa His Val Asp Leu Leu Val Gly Ala
                20                  25                  30

Ala Xaa Phe

(2) INFORMATION FOR SEQ ID NO: 245:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:

Trp Val Xaa Xaa Xaa Pro Thr Val Ala Thr Arg Asp Gly Lys Leu
                5                   10                  15

Pro Xaa Xaa Gln Leu Arg Arg Xaa Ile Asp Leu Leu Val Gly Ser
                20                  25                  30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:

Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro Gly Ala
                5                   10                  15

Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala
                20                  25                  30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:247:

Trp Val Ala Leu Thr Pro Thr Val Ala Xaa Xaa Tyr Ile Gly Ala
                5                   10                  15

Pro Leu Xaa Ser Xaa Arg Arg His Val Asp Leu Met Val Gly Ala
                20                  25                  30

Ala Thr Val



(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu Asn Ala
                5                   10                  15

Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val Gly Gly
                20                  25                  30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro Asn Ala
                5                   10                  15

Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala
                20                  25                  30

Ala Thr Met

(2) INFORMATION FOR SEQ ID NO: 250:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:

Trp Val Xaa Ile Thr Pro Thr Leu Ser Ala Pro Xaa Xaa Gly Ala
                5                   10                  15

Val Thr Ala Pro Leu Arg Arg Xaa Val Asp Tyr Leu Ala Gly Gly
                20                  25                  30

Ala Ala Leu

(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Trp His Ala Val Thr Pro Thr Leu Ala Ile Pro Asn Ala Ser Thr
                5                   10                  15

Pro Ala Thr Gly Phe Arg Arg His Val Asp Leu Leu Ala Gly Ala
                20                  25                  30

Ala Val Val



25 



(2) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 

30 (A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



35 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252: 

Thr Leu Thr Met lie Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu 

5 10 15 

Xaa Leu Xaa Val Val Phe 61y Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 2 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253:

Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro Glu Val

5 10 15

Ile Leu Asp Ile Val Thr Gly Gly

20 

(2) INFORMATION FOR SEQ ID NO: 254: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:

Thr Xaa Thr Xaa Ile Leu Ala Tyr Xaa Met Arg Val Pro Glu Val

5 10 15

Ile Xaa Asp Ile Xaa Xaa Gly Ala

20 



(2) INFORMATION FOR SEQ ID NO: 255: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:

Ala Val Gly Met Val Val Ala His Xaa Leu Arg Leu Pro Gln Thr

5 10 15

Xaa Phe Asp Ile Xaa Ala Gly Ala

20

(2) INFORMATION FOR SEQ ID NO: 256:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256:

Thr Xaa Ala Leu Val Xaa Ser Gln Leu Leu Arg Xaa Pro Gln Ala

5 10 15

Xaa Xaa Asp Xaa Val Xaa Gly Ala

20

(2) INFORMATION FOR SEQ ID NO: 257:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:

Thr Xaa Ala Leu Val Xaa Ala Gln Leu Leu Arg Xaa Pro Gln Ala

5 10 15

Xaa Leu Asp Met Ile Ala Gly Ala

20

(2) INFORMATION FOR SEQ ID NO: 258: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:258: 

Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro Thr Ala

5 10 15

Phe Leu Asp Met Val Ala Gly Gly

20 

(2) INFORMATION FOR SEQ ID NO: 259: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:

Thr Thr Thr Leu Xaa Leu Ala Gln Val Met Arg Ile Pro Ser Thr

5 10 15 

Leu Val Asp Leu Leu Xaa Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 260: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:

Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro Gly Ala

5 10 15

Met Val Asp Leu Leu Ala Gly Gly 

20 



(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261: 

Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro Ser Ile

5 10 15

Leu Gly Asp Leu Leu Thr Gly Gly

20

(2) INFORMATION FOR SEQ ID NO: 262:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:

Xaa Thr Ala Leu Xaa Met Ala Gln Xaa Leu Arg Ile Pro Gln Val

5 10 15

Val Ile Asp Ile Ile Ala Gly Xaa

20

(2) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:

Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro Glu Ile

5 10 15

Cys Ala Ser Val Ile Phe Gly Gly

20 
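
The peptide entries in the sequence listing use three-letter residue codes, with Xaa marking an unspecified residue. As a minimal sketch, assuming the standard IUPAC one-letter codes and treating bare integers as position markers, a listing entry could be converted to a compact one-letter string as follows. The function name `to_one_letter` is illustrative and is not part of the application.

```python
# Standard IUPAC three-letter to one-letter amino acid codes.
# "Xaa" marks an unspecified residue and is mapped to "X".
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(listing: str) -> str:
    """Convert a whitespace-separated three-letter listing to one-letter codes."""
    residues = []
    for token in listing.split():
        if token.isdigit():
            continue  # skip position markers such as 5, 10, 15
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# SEQ ID NO: 263 (23 residues), typed in with its position markers.
seq_263 = to_one_letter(
    "Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro Glu Ile "
    "5 10 15 "
    "Cys Ala Ser Val Ile Phe Gly Gly 20"
)
print(seq_263, len(seq_263))
```

Raising on an unrecognized token, rather than skipping it, makes scanning errors in a residue code visible instead of silently shortening the sequence.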
CLAIMS

1. A purified and isolated DNA having a
sequence selected from the group consisting of SEQ ID NO:1
through SEQ ID NO: 51.

2. A purified and isolated protein encoded by a 
gene whose sequence includes a sequence selected from the 
group consisting of SEQ ID NO:52 through SEQ ID NO: 102.

3. A purified and isolated DNA having a
sequence selected from the group consisting of SEQ ID NO:
103 through SEQ ID NO: 154.

4. A purified and isolated protein encoded by a 
gene sequence selected from the group consisting of SEQ ID 
NO: 155 through SEQ ID NO: 206.

5. A purified and isolated protein having an 
amino acid sequence selected from the group consisting of 
SEQ ID NO: 52 through SEQ ID NO: 102 and SEQ ID NO: 155 
through SEQ ID NO: 206.

6. A method for the recombinant DNA-directed 
synthesis of a protein, said method comprising: 

culturing a transformed or transfected host
organism containing a DNA sequence capable 
of directing the host organism to produce 
said protein under conditions such that the 
protein is produced, said protein exhibiting 
substantial homology to a protein comprising 
the amino acid sequence selected from the 
group consisting of SEQ ID NO: 52 through SEQ 
ID NO: 102 or SEQ ID NO: 155 through SEQ ID 
NO:206. 



7. The method of claim 6, wherein the host
organism is transfected with a recombinant eukaryotic
expression vector.

8. The method of claim 7, wherein the host 
organism is a eukaryotic cell. 

9. A recombinant expression vector comprising a 
DNA sequence selected from the group consisting of SEQ ID 
NO:1 through SEQ ID NO:51 and SEQ ID NO:103 through SEQ ID
NO:154. 

10. A host organism transformed or transfected
with a recombinant expression vector according to claim 9. 

11. A method of detecting antibodies against 
HCV, said method comprising:

(a) contacting a biological sample with at
least one protein of claim 5 to form an
immune complex with the antibodies; and

(b) detecting the presence of the immune
complex.

12. The method of claim 11 wherein the 
biological sample is selected from the group consisting of 
serum, saliva or lymphocytes or other mononuclear cells. 

13. The method of claim 11, wherein the
recombinant protein is bound to a solid support.

14. The method of claim 11, wherein the immune
complex is detected using a labeled antibody.

15. A hepatitis C virus kit comprising: at least 
one protein comprising an amino acid sequence selected from
the group consisting of: SEQ ID NO: 52 through SEQ ID NO: 102
and SEQ ID NO: 155 through SEQ ID NO: 206.

16. A composition comprising at least one
recombinant protein of claim 5 and an excipient, diluent or 
carrier. 

17. A composition comprising an expression
vector capable of directing host organism synthesis of a
protein having an amino acid sequence selected from the
group consisting of SEQ ID NO: 52 through SEQ ID NO: 102
and SEQ ID NO: 155 through SEQ ID NO: 206.

18. A method of preventing hepatitis C
infection, comprising administering the composition of
claim 16 or 17 to a mammal in an effective amount to
stimulate the production of protective antibody.

19. A vaccine for immunizing a mammal against
hepatitis C infection, comprising at least one protein
according to claim 5 in a pharmacologically acceptable
carrier.

20. A vaccine for immunizing a mammal against
hepatitis C infection, said vaccine comprising an
expression vector capable of directing host organism
synthesis of a protein having an amino acid sequence
selected from the group consisting of SEQ ID NO: 52 - SEQ ID
NO:102 and SEQ ID NO:155 - SEQ ID NO:206.

21. A method for detecting the presence of the
hepatitis C virus via a reverse transcription-polymerase
chain reaction, said method comprising amplifying an HCV
reverse transcription product by polymerase chain reaction
using universal primers.

22. The method of claim 21, wherein said
universal primers are deduced from universally conserved
nucleotide domains found in SEQ ID NO: 1 through SEQ ID NO:
51, in SEQ ID NO: 103 through SEQ ID NO: 154, or in
consensus sequences shown in Figures 1A-H and 6A-K.

23. Substantially isolated and purified 
universal primers, wherein said primers have nucleic acid 
sequences derived from universally conserved nucleotide 
domains found in SEQ ID NO:1 through SEQ ID NO: 51, in SEQ
ID NO: 103 through SEQ ID NO: 154 and in consensus sequences
shown in Figures 1A-H and 6A-K.
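
One way to picture how a universally conserved nucleotide domain might be located is to scan a set of pre-aligned sequences for runs of columns in which every sequence carries the same base. The sketch below is only an illustration under that assumption: the function name `conserved_windows`, the `min_len` threshold and the toy sequences are invented for the example and are not taken from the application or its figures.

```python
from typing import List, Tuple


def conserved_windows(aligned: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where all sequences agree.

    `aligned` holds equal-length, pre-aligned sequences. A gap ('-') in the
    first sequence never counts as conserved.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")

    windows = []
    run_start = None
    # Iterate one column past the end so a trailing run is closed as well.
    for col in range(length + 1):
        conserved = (
            col < length
            and aligned[0][col] != "-"
            and all(s[col].upper() == aligned[0][col].upper() for s in aligned)
        )
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_len:
                windows.append((run_start, col))
            run_start = None
    return windows


# Invented toy alignment; the second sequence differs at columns 5 and 20.
toy = [
    "ATGGCTTGGGATATGATGAACTGG",
    "ATGGCATGGGATATGATGAATTGG",
    "ATGGCTTGGGATATGATGAACTGG",
]
print(conserved_windows(toy, min_len=8))
```

Columns are zero-based and ranges are half-open, so a reported window can be used directly as a Python slice on any of the aligned sequences.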

24. A diagnostic kit for use in detecting the 
presence of hepatitis C virus in a biological sample, said 
kit comprising at least two universal primers according to 
claim 22.

25. A diagnostic kit for use in detecting the 
presence of hepatitis C virus in a biological sample, said
kit comprising at least one nucleic acid sequence selected
from the group consisting of SEQ ID NO: 1-51 or SEQ ID
NO: 103-154.

26. A method for determining the genotype of a 
hepatitis C virus, said method comprising: 

amplifying reverse transcription
products of RNA via polymerase chain
reaction using genotype-specific
amplification primers deduced from
genotype-specific nucleotide domains
found in SEQ ID NO:1 through SEQ ID
NO: 51, in SEQ ID NO: 103 through SEQ ID
NO: 154, or in consensus sequences shown
in Figures 1A-H and 6A-K.

27. A method for determining the genotype of a
hepatitis C virus, said method comprising:

(a) amplifying RNA via reverse
transcription-polymerase chain reaction
to produce amplification products;

(b) contacting said products with at least
one sequence shown in SEQ ID NO:1
through SEQ ID NO: 51 and SEQ ID NO: 103
through SEQ ID NO: 154; and

(c) detecting complexes of said product
which bind to said nucleic acid
sequence.

28. A method for determining the genotype of a
hepatitis C virus, said method comprising:

(a) amplifying RNA via reverse
transcription-polymerase chain reaction
to produce amplification products;

(b) contacting said products with at least
one genotype-specific oligonucleotide;
and

(c) detecting complexes of said products
which bind to said oligonucleotide(s).

29. The method of claims 27 or 28, wherein said
amplification of step (a) uses universal primers deduced
from universally conserved nucleotide domains found in SEQ
ID NO:1 through SEQ ID NO: 51, in SEQ ID NO: 103 through SEQ
ID NO: 154, or in consensus sequences shown in Figures 1A-H
and 6A-K.

30. The method of claim 28, wherein said
genotype-specific oligonucleotide of step (b) is a nucleic
acid sequence deduced from genotype-specific nucleotide
domains found in SEQ ID NO:1 through SEQ ID NO: 51 and SEQ
ID NO: 103 through SEQ ID NO: 154, or in consensus sequences
shown in Figures 1A-H and 6A-K.

31. Substantially isolated and purified
genotype-specific oligonucleotides, wherein said
oligonucleotides have nucleic acid sequences deduced from
genotype-specific nucleotide domains found in SEQ ID NO:1
through SEQ ID NO: 51, in SEQ ID NO: 103 through SEQ ID
NO: 154, or in consensus sequences shown in Figures 1A-H and
6A-K.
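
A genotype-specific nucleotide domain can be thought of as a stretch of an alignment in which every isolate of one genotype shares a base that no isolate of any other genotype carries. The following sketch illustrates that idea for single columns only. It assumes pre-aligned, equal-length sequences already grouped by genotype; the function name `genotype_specific_columns`, the group labels and the toy sequences are hypothetical and do not come from the application.

```python
from typing import Dict, List


def genotype_specific_columns(groups: Dict[str, List[str]], target: str) -> List[int]:
    """Return columns where all `target` sequences share a base no other group uses."""
    members = groups[target]
    others = [s for name, seqs in groups.items() if name != target for s in seqs]

    columns = []
    for col in range(len(members[0])):
        base = members[0][col]
        if base == "-":
            continue  # a gap is not treated as a distinguishing base
        same_within = all(s[col] == base for s in members)
        absent_elsewhere = all(s[col] != base for s in others)
        if same_within and absent_elsewhere:
            columns.append(col)
    return columns


# Invented toy groups, six aligned columns each.
toy_groups = {
    "type_a": ["ACGTAC", "ACGTAC"],
    "type_b": ["ACCTAC", "ACCTGC"],
    "type_c": ["ACCTAT", "ACCTAT"],
}
print(genotype_specific_columns(toy_groups, "type_a"))
print(genotype_specific_columns(toy_groups, "type_c"))
```

In practice a probe or primer would need a run of such positions, or a window containing enough of them, rather than a single column; the per-column test is kept here only to make the criterion explicit.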



32. Substantially purified and isolated 
genotype-specific peptides having amino acid sequences
deduced from genotype-specific amino acid domains located
in SEQ ID NO:52 through SEQ ID NO:102, in SEQ ID NO:155
through SEQ ID NO: 206, or in consensus sequences shown in
Figures 2A-H and 7A-K.

33. A method of detecting antibodies specific
for a single genotype of HCV, said method comprising:

(a) contacting a biological sample with at
least one peptide of claim 32 to form
an immune complex with the antibodies,
and

(b) detecting the presence of the immune
complex.

34. The method of claim 33, wherein the
biological sample is selected from the group consisting of
serum, saliva or lymphocytes or other mononuclear cells.

35. The method of claim 33, wherein said peptide
is bound to a solid support.

36. The method of claim 33, wherein the immune
complex is detected using a labelled antibody or antigen.

37. A kit for use in detecting antibodies
specific for a single genotype of HCV, said kit comprising:
at least one peptide selected from the genotype-specific
peptides of claim 32.

38. Substantially purified and isolated 
universal peptides having amino acid sequences deduced from 
universally conserved amino acid domains found in SEQ ID
NO: 52 through SEQ ID NO: 102, in SEQ ID NO: 155 through SEQ
ID NO: 206, or in consensus sequences shown in Figures 2A-H
and 7A-K.

39. A method of detecting antibodies against all
genotypes of HCV, said method comprising:

(a) contacting a biological sample with at
least one peptide of claim 38 to form
an immune complex with the antibodies,
and

(b) detecting the presence of the immune
complex.

40. The method of claim 39, wherein the
biological sample is selected from the group consisting of
serum, saliva or lymphocytes or other mononuclear cells.

41. The method of claim 39, wherein said peptide
is bound to a solid support.

42. The method of claim 39, wherein the immune
complex is detected using a labelled antibody or antigen.

43. A composition comprising at least one peptide
of claim 32 and an excipient, diluent or carrier.

44. A composition comprising at least one peptide
of claim 38 and an excipient, diluent or carrier.

45. A method of preventing hepatitis C
infection, comprising administering the composition of
claims 43 or 44 to a mammal in an effective amount to
stimulate production of a protective antibody.

46. A vaccine for immunizing a mammal against
hepatitis C infection, comprising at least one peptide
according to claims 32 or 38 in a pharmaceutically
acceptable carrier.

47. A composition comprising at least one
expression vector capable of directing host organism
synthesis of a genotype-specific peptide having amino acid
sequence deduced from a genotype-specific amino acid domain
located in SEQ ID NO: 52 - SEQ ID NO: 102, and SEQ ID NO: 155
- SEQ ID NO: 206, or in consensus sequences shown in figures
2A-H and 7A-K.

48. A composition comprising at least one
expression vector capable of directing host organism
synthesis of a universal peptide having amino acid sequence
deduced from universally conserved amino acid domains found
in SEQ ID NO: 52 - SEQ ID NO: 102, and SEQ ID NO: 155 - SEQ ID
NO: 206, or in consensus sequences shown in figures 2A-H and
7A-K.

49. A method of preventing hepatitis C
infection, comprising administering the composition of
claims 47 or 48 to a mammal in an effective amount to
stimulate production of a protective antibody.

50. A vaccine for immunizing a mammal against
hepatitis C infection, said vaccine comprising at least one
expression vector capable of directing host organism
synthesis of a genotype-specific peptide having amino acid
sequence deduced from a genotype-specific amino acid
domain located in SEQ ID NO: 52 - SEQ ID NO: 102, and SEQ ID
NO: 155 - SEQ ID NO:206, or in consensus sequences shown in
figures 2A-H and 7A-K.

51. A vaccine for immunizing a mammal against
hepatitis C infection, comprising at least one expression
vector capable of directing host organism synthesis of a
universal peptide having amino acid sequence deduced from
universally conserved amino acid domain found in SEQ ID
NO:52 - SEQ ID NO:102, and SEQ ID NO:155 - SEQ ID NO:206,
or in consensus sequences shown in figures 2A-H and 7A-K.

52. Anti-HCV core antibodies having specific 
binding affinity for core protein of a single genotype of 
HCV. 

53. Anti-HCV envelope 1 antibodies having 
specific binding affinity for envelope 1 protein of a 
single genotype of HCV. 

54. The antibodies of claims 52 or 53 wherein 
said antibodies are monoclonal antibodies. 

55. A method of detecting core protein specific 
for a single genotype of HCV, said method comprising:

(a) contacting a biological sample with at
least one antibody of claim 52 to form
an immune complex with said core
protein, and

(b) detecting the presence of the immune
complex.

56. A method of detecting E1 protein specific
for a single genotype of HCV, said method comprising:

(a) contacting a biological sample with at
least one antibody of claim 53 to form
an immune complex with said E1 protein;
and

(b) detecting the presence of the immune
complex.

57. The methods of claims 55 or 56, wherein the 
biological sample is selected from the group consisting of
serum, saliva, lymphocytes or other mononuclear cells and
liver. 

58. The method of claims 55 or 56, wherein said 
antibody is bound to a solid support. 

59. A method of detecting antibodies against all
genotypes of HCV, said method comprising:

(a) contacting a biological sample with at
least one universal peptide of claim 38
to form an immune complex with said
antibodies; and

(b) detecting the presence of the immune 
complex. 

35 
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SEO ID KO: 




5 


SI4 


1 


DK7 


8 


USll 


4 


DR4 


3 


DRl 


2 


DK9 


6 


S18 


7 


SWl 


X-8 


coaaenBUS 



FXOOU lA 

1 TACCAACrrGCGCAACTCCACGGGGCTTTJ^CAT^ 
1 TACCAicTGCGcii^ 



1 TJlCCAJtfn*aCgCAACTCgJ^MQas^^ 

1 clcciloTGCGcl^ ft T CSaTTCggCTAftTTCSnCTA 



1 caCCflAGTGCGCAAg'rg^gAC^ 

1 xlcciicTACGciicTC 



1 TACCAAGXACGCAACTCCaCGGGCCinACCATG^ 

tACCAAiOT - CGCAACTCcaCgGGgCTtTACCATGTcACCAATGAtTGCCCXAACTCGAGtA 

SEP^IP TO; lafllatft TtCntgTACGASaCaGCtGATOCtJLTOT 
1 0K7 62 TcGTGTACGAGGC 

a USll 62 Treiiiicii^ 



4 0R4 62 TTGTGTACGAGGCGGCCGATGCCATCCTGa«»CGCCGGGGTGTGTCC(^^ 



3 DRl 62 TTCrrGT^^ 

2 DK9 62 Vro TCT A CGAGGCGGCOGATGCCATCCTGCAtTCTCCa<^ 

7 SWl 62 TOTOTACOAGACtMCC^ 

1 - 8 consenBUB TtCTGTACGAOgCgaCcOJa^CcATcCTgCAc-CtCCgGGgTGTGTc^^^ 



SEO ID NO: 




5 


S14 


1 


Dja 


a 


OSll 


4 


DR4 


3 


DRl 


2 


DIC9 


6 


S18 


7 


SWl 


1-8 


confiensuB 



123 CXKn'AACacCTCGAG UlYI ' IY^ I^^^ 
123 GGGXAATOtC^^ 



123 GGCTAACGCtTCGAG UimUtXKS TTCC^^ 

123 GGCnAACaCCTCGAWji^^^ 

123 GGCTAACGCCTCGA GUim Tt ^ ^ 

123 GGCTAACG^^rCG^^ 



123 GGGTAAOSOTCGAgATOT 



23 GOaTggCGCCcCCy^gTCnTGGGTGgCGGTGGCCCCaiCAGTc^ 

GGgTaaCgcctOyiggTGTTGGGTGgCGgTGaCCCCCACgGTgGCCACcJ^^ 



wo 96/05315 



PCTAJS95/10398 



2/89 
FZODU lA 



SEO ID NO: 
5 


iBolate 
S14 


1B4 


1 


DK7 


184 


8 


USll 


184 


4 


DR4 


184 


3 


DRl 


184 


2 


DIC9 


184 


6 


S18 


184 


7 


SWl 


184 


1-8 






SEO ID NO: 
5 


S14 


245 


1 


DK7 


245 


8 


Sll 


245 


4 


DR4 


245 


a 


DRl 


245 


2 


DK9 


245 


6 


S18 


245 




SWl 


245 


1-8 


consensus 




SEO ID NO: 
5 


iBQlatft 
S14 


306 


1 


DK7 


306 


8 


Sll 


306 


4 


DR4 


306 


3 


DRl 


306 


2 


DIC9 


306 


6 


S16 


306 


7 


SWl 


306 


1-8 


consensus 





lllill II llllllllllllll llillllllllll IIIIIIII IIIIII IIIUII 

CTCCCCACAgCGCAGCTrCGACGTa«JlTCGATCTX3CrcGTCGGGAGtGC^^ 

iiiiiiiii IIII iiiiiiiiiiniiiiiiiiili IIIIIIII iiiiii imiii 

CirCCCACAACGCAaCTTCGACGTCACATCGATCTGCTTXmiGGGAGCGCCJVCCCTCT^ 

iiMiiiiiiiiii II llllllllllllll in mii niiiiiiiiiiiiiin i 

CrCCCCACJU^GCAGCrcCGACGTCACATCGACClXjCi'iXJTCGGGAGCGC 

llllllllillllllll llllllllllllllllll lllllll Ullllllllllllllll 

CTCCCCACAACGaUS LTXtXi ACCntJlCATCGACC U XiLTlV 

null tiiiiiiiiiiiiiiiiiiiiiiii iii iiiii iiiiiiiiiiniiiiiiii 

CTCCCCGCAAOSCAGCTrCGACGTCACATCGATCTGCXTGTCGGGJ^ 

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii iim I in I II II mil Mill 

CTCCCCGCJUlCGCAGCrrTCGACGTCACATaSATCTGCTre 

inn iiiiiiiiiiiiiiiiiniiiiiiiiii niiii n niiiinniiiin 

CrCCCtGCAACGCAGCTTCGACGTa^TCGATCTGCXTGTcGGaAGCG^ 

CTCCCc - CAaCGCAgCTt CGACGTcACATCGAtCTGCTtGTcGGgAGcGCCACCCTCTGrt 



CGGCCCTCTACfflGGGGGJlCtTOCGCGGGTCTGTCITlCTTG^ 

inniiniiniiiini iiiiiii iiiiiiiiiiiiiii iiiiii iiiiii nni 

CGGCCCTCIA£GTGGGGGACCTGTGGGGUlV11alVl'i'lVi'lX«'i'CGGTCA^^ 

iinniHinnniiiinnni nniiinnnnin niii niiiii nii 
nil I II nil linn iiiiiiniiiiiiii nninnnnnn inn 



CXXSCCCTCTACCTGGGGGACt 

iiiiinnnniiiiin ini 
???ifffnTTm??m'^ 

CGGCCCrCTATGTGGGGGACt' 

iiiiiiiiiiiinnnii iiini 

CGGCCCTCTATGTGGGGGAC( 

iinnnii innnii iiiiininnii 

CGGCCCTCTAcCrrGGGGGACt' 
CGGCCCTCTACGTGGGGGAC - TGTGC 



ii iinni iiiijiiiiinnininiiii 

'CTGTCTTCCTTGTCGGTCAACTGTTCACCTT 

iiiHiiiiii niin ii II I II! I II I II II 

iCTTGTCGGCCAACTGnCACCrr 

III iiiiii IIII IIIIII II I 

CCAaCTGTTCACtaT 

II inn n niinii i 

cGTCAGtCAaCTGTTCACgtT 
PtCTtGTCgGtCAaCTGTTcACctT 



CTCTCCCAGGCQCCtCTGGACGACGCAftGaCTGCA AriG TT C T A TCTATCCcGG 

linn III III 1 1 III 1 11 IIII nil iiiiiiiniiiiiiiiiiii inn nil 

CTCTCCCAGGCGCCACTGGACGACGatftfy^CItXW ir fG' n^ 

niiinn nninnnnnn innnininniiiniii iiiiiiiii 

CTCTCCCAGaCGCCACTGGACQACGOXgGGCTOCAATrGTTCTATC^^ 

innnii I iininn inn i ininnnn Miiiiniiiiiiini 

CrCTCCCAGGCaCCACTGGACAACCCAASACTGOUlTTGTTCc^^ 

innnni niiiiniininnnnninnii niiniiiiiiiniii 

tTCTCCCAGGCGCOlCTGGACAACGCAAGACTXSCAATTGITCTATCTATCCCGGCCAT^^ 

II inn I II II I II II II II n n 1 1 1 II II I i ii ii n ii ii iiiiiniin 

CnrCCCOtfSaCGCCACTGGACAACGCAASACTGCAACItjn'lCiAT^ 

iiiiinn iiiiiiiiiiniiiiiiiiiiiiii iiiiiii iiiiiniiiiiiiiii 

CTCCCCCAGGCGCCACTGGACAACGOUtfyiCrrGCAACTGTT^ 

iiiiiiniiiiiiiiiiniiiiiiniiiii iniiiiiiniii nniin in 

CTCCCCCACKSCGCCACTCXaACAACGCAAGACTGtAACTtriTCTAT 
cTCtCCCAQgCgCCaCTCKSACaACGa^GaCTGcAAtTGTTCtATCTAtCC^^ 
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SEO ID NO: 




5 


S14 


X 


DIC7 


8 


SIX 


4 


DR4 


2 


DRX 


2 


DK9 


6 


SX8 


7 


swx 


1-8 


consensus 
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FZGOU lA 









5 


S14 


jb / 


1 


DK7 


367 


8 


SIX 


3S7 


4 


DR4 


367 


3 


DRX 


367 


2 


DK9 


367 


6 


S18 


367 


7 


SWX 


367 


1-8 


consensuB 




S£Q IP ro; 


Isolate 




5 


SX4 


428 


1 


DK7 


428 


8 


SXX 


428 


4 


DR4 


428 


3 


ORl 


428 


2 


DK9 


428 


6 


SX8 


428 


7 


SWX 


428 


1-8 


consensus 





Mill 

ACGGC 
gCGGC 

r^ww^jw * ^— -■'^ w mm ■» » w — - ■ — -w-— — — ^ — --^ — - "j — "| — Jlljlltllllltlll i 111 



II II 

CAtCGcATGGC 
II IMII 

Mill 



ACGGGtCJ^GcATGGCaTGGGATATGATGATGAACTGGTCCCCtACgaC -GCgeTGGTag 



I II' 



428 TAGCTCftSCT G CT C AGGaTCCCGCAAGCCGTCTTGGACATG&TCGCTtySTC 

TaGCtCAGCT G CTCcGGaTCCC • CAaGCCaTCTTGGAcATGATCGCTGGtGCcCACTGGGG 



489 

489 AGTCCTgGCGGGCOT 



489 AfiTCCTAGCGGG»TAGC^ 
489 AGTCCTAGCGGGCATWSOn'ATrraraiTGGT^ 
489 AGTCCTAGCGGGCATASCGTATTTCTCC»TGGTGGGG 
489 AGTCCIAOCGGGCATAGCGTATITCT^^^ 

489 Aci^CTLyrGGGC^ 

AGTCCTaGCGGGCATAOCGTATrrcTCCATGGtGGGgWlCTGGGCGAAGGTCcTggTaGTg 
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PZGimX lA 



SEO ID NO: 




5 


S14 


1 


DK7 


a 


USll 


4 


DR4 


3 


DRl 


2 


D1C9 


6 


S18 


7 


SWl 


1-8 


consensua 



550 CTGCTGCTATTcGCCGGCGTtQACGCG 

lllllllllll llilllll illlM 
550 CTGCTGCTATrTGCCGGCGTCGACQCG 

lllllllllltllllMllllllim 
550 CTGCTGCTATTTGCCGGCGTCGACGCG 

III MM lllllllllll 11 ill 

550 CTGTTGCTGTTrGCCX^GCGTTGATGCG 
. I III ll llll III I II I I II I I II HI 
550 CTGTTGCrGTTTGCCGGCGTTGATGCG 

iiiiii iiiiii iiiiiii nun 

550 CTGTTGCTGTTTaCCGGCGTCGATOCG 

iiiniiiiiii iiiiiiiiimii 

550 Ci'U'ri'bC'i'Ul I'lgCCGGCGTCGATGCG 

iiiininiii II n II n II n 1 1 

550 CTGTTGCTGTTTtCCGGCGTCGATCCG 



CTGtTGCTgTTtgCCGGCGTcGAtGCG 
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PIOORS IB 



SEO ID NO: 
11 


Isolate 
DKl 


1 


TATGAWTTGCGCAACGTGTCCCKKSgTGTACCAcGTCi^aAAOT 


24 


TIO 


1 


TATXSAAGTGroCAACCT 

TATGAAGTX^CGCAACGTCnrCGGGGTGTACa^GTa^cA^ 


10 


D3 


1 


9 


Dl 


1 


TAiGAAGTGCGCAA 


14 


HK5 


1 


15 


KK8 


1 


TAT(^AGTCCGCAAC^^ 


12 


HK3 


1 


TATGiiiiGCGCAACGT^^ 
TAcoiioTGCGci^ 


23 


T3 


1 


22 


SW3 


1 


TATGAAGTGCGCAAOT 

TATCAgCTTGOTciA^^ 

TATOAACTGCGCAAC^^ 


17 


110)8 


1 


16 


INDS 


1 


21 


SAIO 


1 


TATGAAgTtyrGC^^ 


20 


S45 


1 TATGAAGTGCGOUlCGTGTCCGGGgcGTACaOTnxaC^ 


25 


US6 


1 


TATGAAGTGCGCAACGTGTCCGGGATGTACCATGTCACGAJ^ 


13 


HK4 


1 


cATC^ACTCcicAACCT 


IB 


PIO 


1 


TATt^ACTGCGCA^ 


19 


59 


1 


TATGAATOCGCAACCT 


9-25 


consensus 




tAtGAaGTGCgCAACGTgTCCGGGgtgTAccAtGTCACgAAcGACTGcTCCAACTca^ 



•mnj 
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FIGDRS IB 



SEO ID NO: 


Isolate 




11 


DKl 


€2 


24 


TIO 


62 


10 


D3 


62 


9 


Dl 


62 


14 


HK5 


62 


IS 


HK8 


62 


12 


HK3 


62 


23 


T3 


62 


22 


sm 


62 


17 


INDB 


62 


16 


IND5 


62 


21 


SAID 


62 


20 


S45 


62 


25 


US6 


62 


13 


HX4 


62 


18 


PIO 


62 


19 


S9 


62 


9-25 


consensus 






62 TTGTGTATGAGOCAOCGGACgTGATCc^CACACCCCtGGGrrGCGTO 
TTGTGTATGWKSCAGCGGACATGATCATGCACACtCCCGGGTGCGTGrc 



II 



TTGTOTATGAGGaUjCGGAC:ATGATCATGCAtJU:CCCCGGGTGCCr^ 

rGTGTATGAGGCAGCGGACATGATaATGCAcACCCCCGG(?TGC6T^ 
TTGTCTAcGASGC^^ 
TtGTGTatGAggCAgcgGftCaTGATcaTGCAcACcCCcGGgTGcgTgCCCTGcGTtCg^^ 
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FIGURE IB 



SEO ID NO: 








11 


DKl 


123 


Ga&CAACcaCTCCCGtTGCTtWGTAGCGCra^C^ 
GGgCAAL 1 CL 1 CCCUCl XiC lX»GGTi\GCG 


24 


TIO 


123 


10 


D3 


123 


GGACAACTCCTCTCGCTGCTGGGTJtf»CGCTCACCCC» 
G<yiCW^CTCCTCTCGCTGCTGGG^ 

1 1 1 1 1 1 1 1 1 1 II II 1 1 i 1 II 1 1 1 1 II 1 1 M 1 11 M 1 1 1 1 1 1 1 1 1 1 II 1 II 1 
aAACAACTCCTCCCG^ 


9 
14 


01 
KK5 


123 
123 


15 


HK8 


123 


GiAACTVACTCf^f^f^f^GT^cTCXjGTgGCGCTCACTC i 

MlillllMIIIII II mil lllllllllltllllllllllll III It Mini 

GAACAACtCCTCCCGCTGtTGGGTJUSCGCT^^ 


12 


HK3 


123 


23 


T3 


123 


11 III lllllllllll IIIIMIIIII lllilllllllMlllllllllill UN 
GAgCJU^tTCCTCCCGLlXiClXjGGTWiCGCTt^ 


22 


SW2 


123 


GGcCAicTCCTCCCGCTGCTGGGTAGCGCTCACTCCCACGCT 

II lllll III 1 IIIIIIIIIIMIIIIIIItlll II 11 II Mini Mill 

GGGCAACTtCTCTaatTOCTGGGTAGC^ 
GGGciimCTCT^ 


17 


IND8 


123 


16 


IKD5 


^23 


21 


SAIO 


123 




20 


S45 


123 


iiiciitTCcrc 

GAACWlCTCCrCCro 


25 
13 


use 

HK4 


123 
123 


18 


PIO 


123 


GggtlicrccrcCCaaTC 


19 


S9 


123 


9-25 


consensuB 




gaacAActcCTCccgcTGtfTGGGTaGCGCTcaCtCCCACgCTcGCgGCcAGGAAcgccAgC 
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SEP ID NO: XSPlatg 



11 DKl 184 aTCCCCJlCTACKSACaATAaSACGCCATGTCGArrrGCTCGTTGGGGCGGC ^ 

24 TIO 184 GTCCCCACTACCy«:gATACGJ^GC»TGTCGATTTGCTCGTTG<XX^^ 
10 D3 184 GTCCCCACTACGAC^TJ^GACGCCACGTCGATTrGCT CG T i ^ G GGGC^ 

9 Dl 184 GTCCCCACTACGGCgATACGRCGCCACGTCGATITGCT CGrit:^ ^ 

1 4 HK5 184 GTCCCCACcACGGatflTACGACGCaVCGTCGACTTGCTC G Tr ^ ^ 

15 HK8 184 GTCCCCAC t ACGACAATACGACGCCftCGTCGACrrGCT CG TT t^ ^ 

12 HK3 184 GTCCCCACcJUXSACJUCTtfXSACGTCACGTCGA L T^ 

23 T3 184 GTCCCCACTAaGACAATACGACGTCACGTCGA LTlULTLVX n Xi GGGC^ 

22 SW2 184 GTCCCCACrACGACAATACGACGCCACGTCG A ' mUCiVUntj ^ 

17 IND8 184 GTCCCCACaUXUCJlATACGACGCCJlCGTCGAr ri t JC i V^ ^ 

16 IND5 184 GTC t CCACCACGACAJlTACGAC&CCACGTCCa ArXUTiClVUriX; ^^ lljU i ' lTL lU i ' l 
21 SAIO 184 GTCCCCACTiUXSACAATACGACGCCACGTCG ArnXA. ^ ^ 

20 S4S 184 GTCCCOLCT^ 

25 US 6 184 GTCCCCACTACGACAA T ACGACGCCACGTCGATX'itjClCUl'iXjG^ 

13 HK4 184 aTCCCCACTACGACAATACGACGCCATGTCGAcTTGCT CG ' nti GGGC ^ 

18 PIO 184 GTCCCaACTACGgCAATACGACGCCATGTCGATTTC 

19 S9 184 GTCCCcACcACGaGAATACGACGtCATGTCGATTrGCTCGTTG^ 

9-25 conseneuG gTCcCcACtAcGaCaATACGACgcCAcGTCGAtTTGCTCGTrGGGGCGGCT^ 
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SfiP IP WQ; 
11 

24 

10 
9 
14 
15 
12 
23 
22 
17 
16 
21 
20 
25 
13 
18 
19 
9-25 



DKl 
TIO 
D3 
Dl 
HK5 
HK8 
HK3 
T3 
SW2 
IND8 
IIIDS 
SAIO 
S45 
0S6 
HX4 
PIO 
S9 

consensus 



9/89 

FXOOU IB 



245 CCGCTATGTAcGTGGGgGACCTCTGCGGATCcGTTTTCCTCGTCT 



245 CCGCTATOTAtGTOGGaGACCTCTCCGGATCTCT 
245 COTCCATCTAC^TI^^ 



245 CCGCCATGTACGTGGGGGATCTcOTCG<»TCTGTTIT^^ 
245 CCGCTATCTACGT^^ 

245 CCGCTATGTACffroGGOGATCTCTGaWATCTXnTITCCTCGTCTC 



245 CCGCrATGTACGTGGGGGATCTCTGCGGA' 



'tGTCTCCCAGCTGTTCACCTT 
rCAGCTGTTCACTTT 




245 COJCTATCSTACGTGGGGGATCTaTGCGGA' 

nil llllllllllllll II illlll 
245 CCGCcATtnACGTGGGGGAcCTCTGCGGA' 



245 CCGCTATGTACGTGGGGGATCTCTGCGGA'OT 

rcGTCICCO^ 

CCTT 

245 CCGCTAlOTACGTCGGGGAtC^^ 





245 CCGCTATGTACGTGGGGGAcCTCraCGGgTCcG| 

245 CMCcATCTACGTGGGaGATC^^ 

245 CCGCTATGTACGTGGGGGATCTCTC^^ 

245 CCGCTATCTACGTCGGGGAcCTgTGCGGATCTGTTtTCCrCaTCTCCC^ 

CCGctATGTAcGTGGGgGAtCTcTGCGGaTCtGTttTCCrcgTcTCcC»G<?^^ 



wo 96/05315 



PCTAJS95/10398 



10/89 
PZ6UU IB 

SBQ IP WO; 

11 DKl 306 tTCaCCTCGCCGGCATGAGACagcaCAGGJVCTGCAACTGCTCAATC^ 

24 TIO 306 CTCGCCnrGCCGGCATGJlGACttTgCAGGACTOaUUrrGC^^ 
10 D3 306 CTCGCCTCGCCGGCATGiUSACaGTACAGGAaTGTAACTGC^^ 

9 Dl 306 CTCGCCTCGCCGQCATaAGACGGTAaUSGAgTGTMtTGCTa^ 

14 HX5 306 CTCGCCTCGCaSACACOAOiUrGGTAaiGaACrGCAAC^^ 

15 HK8 306 tTCGCCTCGCCGACACGAGACGGTACAGGACTGCAACltSClt : AATCT 

12 H1C3 306 CTCGCCTCGCCGACACGAGAa«?rACAGGACTGC^^ 
23 T3 306 CTCGCCTCGCCGGOltGJ^aurXTiaUGGACTG 

22 SW2 306 tixjJrcrrcGCCTCC^^ 

17 IMD8 306 CTCACCGCGCCGGCATtSAGACACrriUJUXSAC^ 

16 IND5 306 CTCACCGCGCCGGCATGAGACAGTACAGGACTGaUTXTKr^ 
21 SAIO 306 CTCGCCTCGCaXStATGAGACAGTRCAGGACTGCA ft T TC 

20 S45 306 CnXXKrCnrGTOKTCATQAGACAGTACAGGACra 

25 US6 306 CTCGCCTCGTCaGCATGAGACACn^ACAGGACrGC^ 

13 HK4 306 CTCGCCTCGCCGGCATGJUSACgGTACAGGACTGCAATTGc^^ 

18 PIO 306 CTCaCCTCGCCGGCATtgGACAGTACAGGACTGCAATItStTC^ 

19 S9 306 CTCgCCcCGtCGGCATgaGACAGTACAGaACTGCAATTOdTCAAT^^ 

9 - 25 consensus crrCgCCtaScCggcAtgaGACagtaCAGgAcTGcMcTGcTCaaTCTATCCcGGcCacgTa 



wo 96/05315 



PCT/US95yi0398 



11 

24 

10 
9 
14 
15 
12 
23 
22 
17 
16 
21 
20 
25 
13 
IB 
19 
9-25 



PZOORZ IB 

DKl 367 TCAGGTCJlCCGCATtXKrrrGGGAtATGATOATGAACTGGTC&CCTAC^ 

iiiiiiiiiiiiiiiiiiiiiii iiiiiiiiiiiiiitii tiiiiiiiiii mill 

TIC 367 TCiU»3TCACCGCATGGCITGGGAcATGATGATC3AACTGGTCGCC^^ 

lllllllllllltllllllll! Iltlllllllilllllllllllll nil lllllll 

D3 367 ACAGGTCACCGCATGGCTTGGGATATGJmUlTGAACT G CmiGCCTACAgO^ 

lllilllllll IIMIIIItllllllllllllllllllll llllll llltl lllill 
Dl 367 ACiUKntJUrCGtATGGCTTGGGATATGATGATGAACTGGTCACCT 

lllilllllll lllllllllllllllllllllllllllllllillllllllll llllll 
HKS 367 ACAGGTCACCGOllXK^CTTGGGATATGATGATGAACTGGTCACCTACAACi^ 

llllllllllllllllllllllllllllllllllllllll II llllllltllllllll 
HK8 367 TCAGGTCACCGCATGGCriTKKlATATGATGATGAACTGGTCgCCcACAACAGCCCT 

lllllllllllllllltlillllllllllllllllllllll II III llllllllllll 
HK3 367 TCAGGTCACCGCATGGCTTGGGATATGATGATGAAClt ^G TCcCCtJtf^ 

llllllllll lllltlllllllllllllllllllllMI II III I II lllllll 
T3 367 aCJUKnCACCGtATGGCTTGGGATATGATGATGAACTGGTCgCCcACAaC 

llllllllll lllilllllll llllllllllllllllt It III I II II INI 
SW2 367 TCAGGTCACCGCATGGCITGGGAcATGATGATGAACTCGT^ 

iiiiiiiiiiiiiiiiiiititi iiiiiiiiiiiiiiiiiiiiiiiiti iiiit nil 

IND8 367 TCAGGTCACCGCATGGCTIGGGATATGATGATGAACTGGT C ACCTAC^ 

II It II III 1 1 nil 1 1 It tiiiiiiiiiniii llllll It lllllll III III nil 

IRD5 367 TCAGGTCACCGCATGGCcnTGGGATATGATGATGAACTGGTCACCTAC^ 

iiiiiiiiiiniiii niiiiiniiinntiiiiiiiiiini nii nin i 

SAID 367 ACAGGTCACCGCATGGCTZtX^GATATGATGATGAACTtXr^^ 

iiiiiiniiiiiiinniiiiiiiiiiiiiiiiiiiiii in in iiii iiii t 

S45 367 ACAGGTCACCGCATGGCTTGGGATATGATQATGAACTGGTCgCCTACM 

iiiniiininnniiiniiiiiiniiii inn nnininn inni 

nS6 367 TCAGGTOUrCGCATGGCTTGGGATATGATGATGAAtTGGTCACCTAC^^ 

iniiiiiniiiiiiiniiiiinniiiiin iniiiiiiiiinniiiiiiiii 

HK4 367 TCJUSGTCACaSCATGGCrrGGGATATGATGATGAACTGGTCA^ 

iiiiiiiiiniiiiiiiiniiniiiniiiniitiii it inniiiiiiiiiii 

PIO 367 TCAGGTCaCCGCATGGCTrGGGATATGATGATGAACTGOTCGCCc^^ 

iitiiit iiiiiiii iiiiiiiiniiiiiiiiiniiiii III iiniiiiiiii 

SS 367 aCAGGTCRtCGCATGGCcTOGGATATGATGATGAACTt XS TC G CCtACAaCAGCC^ 

consensus tCAGGTCAcCGcATGGCtTGGGAtJVTYSATGATGAAcTGGTCaCCtACAgCaGCcc^ 



FIGURE 1B

SEQ ID NO: Isolate

11 DKl 42 B TaTCGCAGTTACTCCGaATCCaUJUlGCTGTCgTGGftCATt^^ 

24 TIO 428 TgTCGCJU?rrACTCCGGATCCCACAAGCTGTCaTGGAaiT^^ 
10 D3 428 TATCGCAGTTACTCCGGATCCCACAAGCICTCgTGGACAT^ 

9 Dl 428 TATCGCAGTTACTCCGGATCCCACJUUKnxnCaTGGACATXXr^ 

14 HK5 428 TGTCGCAGTTACTCCGGATCCCGCAAGCTGTCGTGGACA^^ 

15 HK8 428 TCntrGCAGTTACTCCGGATCCCGCAAGCTaTCGTGGACAT^^ 

12 KK3 428 TCnCGaUtTTACTCCGGATCCCGOJU^^ 

23 T3 428 TGTCGCAGTTgCTCCGGATCCCACAAGCTC?ICGTGGAC^^ 

22 SW2 428 TArCGCAGTTaCTCCGGATCCCACAAGCTGTCGTGGACATGGTaGCGC^^ 

17 INDd 428 TATCGC»GTTGCTCCGGATCCCACAAGCntnXXnX^ 

16 ZHD5 428 TATCGaUiri UC lt . CGGATCCCACAAGCT G T CG113 GATATGGTGGC^^ 
21 SAIO 428 TATCGCAGTTACTCCGGATCCCACAAGCTaTCGTGGACATGGTGGC^^ 
20 545 428 TATCGCAGTTACTCCGGATCCCACAAGCTGTCGTQQAC^^ 

25 aS6 428 TATCGCAGITACTCCGGATCCCACAAGCTGTCATGGACAT^^ 

13 HK4 428 TATCGCAGTTACTCCGac^CCCACAAGCTGTCATGGACATGGT^ 

18 PIO 428 TgTCGCAGCTACTCCGGATCCCACAAGCTaTCtTGGATgTGGTG^ 

19 S9 428 TaTCGC^GCTAm 

9-25 consensus TaTCGDlgtTaCTCCGgaTCCCaCAAGCrgTCgTGGAcaTGGTgg^



FIGURE 1B

SEQ ID NO: Isolate

11 DKl 489 AGTCCTGGCGGGCCTcGCCTACTAc^rCCATGGCGGGGAACTGGGCcJ^^ 

24 TIO 489 ACTCCTGGCGGGCCTtGCCTACTATICCATGGCGGGGUUVCTGGGCT^^ 
10 D3 489 GCTCCTGGCGGGCCTCGCCTACTATTCCATGGTGGGGAACTGGC^^ 

9 . 01 489 GGTCCTGGCGGGCCTCGCCTACTATTCCATGGTGGGGAACTGGGCTAA 

14 HX5 489 GGTCaxySCGGGCCTTGCCTACTATrCCATOGTGGGaAACTG^ 

15 HK8 489 AGTCCTAGCGGGCCTTGCCTACTATTCCATOGTGGGcAACTGGGCT^ 

12 HK3 489 AGTCCTAGCGGGCCTTGCCTACTATTCCATGGTGGGaAACTX^ 
23 T3 489 AGTCCTGGCGGGCCTTGCCTACTATrCCATGGTGGGGAACT^ 
22 SW2 489 AGTCCTGGCGGGCCTTGCaTACTATTCamKmXXK3AAC^^ 

17 IHD8 489 AATCCTGGCGGGCCTIGCCTACTATrCCATGGTAGGGAACTGGGCT^^ 
IS IND5 489 AATCCTGGCGGGCCTRSCCTACTATXCCATtKn*AGGGAACT^^ 

21 SAIO 489 ACTIXrCTaGCGGGCCrTGCCIACTATrCCATGGTGGGGAACXtXK^ 

20 S45 489 AGTCCTGGCGGGCCirGCCrACTATTCCATGGTGGGGAACT^^ 

25 nS6 489 AGTCCTGGCGGGCCTTCCCtACTATTCCATGGTGGGGAACT^^ 

llllll lllllllllll Mill llllllllllllllllll m ill II I MIII 

13 HK4 489 AGTCCTaGCCXyS CLTitj CtTACTATTCCATGGTGGGGAACTGGGCcAAG Ur^ ^ ^ 

18 PIG 489 AGTCCTGGCGGGCCTTCCCTACTATTCCA^ 

19 S9 489 AGTCCTGGCGGGCCTcGCCTACTATTCCATGGTGGGGAACTG^ 

9-25 consensus agTCCTgGCGGGCCTtGCcTACTAtTCCATGGtgCKSgAACTGGGCtAAGGTttTgATT^




FIGURE 1B

SEQ ID NO: Isolate

11 DK1 550 CCA

iiiiiiiiiillllllllllllllll

24 T10 550 ATGCTJV^TCTTTGCCGGCGTTGATGGG

1 1 1 1 1 1 1 1 1 1 M 1 1 Mill II II

10 D3 550

Illllllllllllllillll llllll

9 D1 550 AXVJWXAh* 1 C X 1 X X uuWu X X vaM%*\JU^

iiiiiiii IIIII iitiiiii II

14 HK5 550 AX\>U X AU XX X X X\i UUVsVvUu nuAiuuu

IIIIIIII llllllltllllllllll

15 HK8 550 ATQ CXAU X g^X X XVCCCVsC^ X X\aAX^9V*v

IIIIIIII llllllllllllllllll

12 HK3 550 ATGCTACTCTTTuCCuuCwTTuATuu^v

lllllll lllllillllllllllll

23 T3 550 CT6CTACTCTTTGCCGGCGTTGA.TGG6

lllllllllllll IIIIIIII III

22 SW2 550 ATGCTACTCTTTGCtGGCCTTGACGGG

llllllllllllll llllllllllll

17 IND8 550 ATGCTACTCTTTSCCGGCGTISACGGG

lllllllllllllllllllllllllll

16 IND5 550 ATGCTACTCnTGCCGGCGTTGACGGO

lllllllllllllllllllllllllll

21 SA10 550 ATGCTACTCrTTGCCGGCGTTGACGGG

lllllllllllllllllllllllllll

20 S45 550 ATGCTACTCTTTGCCGGCGTTOACQOO

iiiiitiiiiiiiiiiiiiiiiiiii

25 US6 550 tTGCTACrCTTTGCCGGCCnTGACGGG

Iiiiiiiiiillllllllllllllll

13 HK4 550 ATGCTACrCTTTGCCGGCGTTGAOGGG

II III II III! mill mi HIM

18 P10 550

tTfn^tTTTfm'ltTmm?*

19 S9 550 ATGCTACTtTTTSCtGGtGTTGACGGg

9-25 consensus aTBCTACTCTTTGCcGGcGTCGAeGGg



FIGURE 1C

SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



GCcCAAGTQAgOi^CACCAgccgCgGtTACATGGlXaACtJUlCGACTGT^ 

II lllllll llllllll I I lllllllllll lllllllllll inn nil 

GCaCAAGTGAAGAACACCAcTAaCAGCTACATGGTGACcAACGAC^ 

II iiniinniiiii ii iinniiinnn n iiiniii ii iiiini 

GCCgAAGTQAAGMCACCAGTACCAGCTACATGGTGACaAAT^ 

I innnn iinniiiiiiiiin niiiiii iiniiii iiiiiiinnii 

GtCcAAGTGAAAAACACCAGTACOUSCTAtATGGTGACcAATGAC^ 
GcccAAGTGAagAACACCAgtacCaGcTAcATGGTGACcAA-GACTGtTCcAA-GAcAGCA 



SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



62 TCACcTGGCAGCTCCAaGCCGCGGTtCTCCACGTCCCCGGGTGTaTCCCGTGtGA^ 

nil iiiiiiiini iiiiiin iiiitiiiiiiiinnijiiiiii iin 

62 TCACtTGGCAGCTCCAGGCCGCGGTCCTCCACGTCCCCGGGTGTGTCCCGTGCGJ^^ 

nil mil iiiiiiiiniinininiiiniiiiiii niiniiiiiii i 

62 TO^dlXSGCAACrCCAGGCCGCGGTCCTCCACGTCCCCGGGTGcGTCCCGTGCGAGA^^ 

nil llllllll nil niiiiiiiniii iiiinii iniiiiniiii in 

62 TCACtTGGCAftCTtgASGCtGCGGTCCTCaUCT^CCCGGQTGtG^ 

TCAC - TGGCA- CTCcAgGCcGCGGTcCTCCACGTcCCCGGGTGt gTCCCGTGcGAGA- ag t 



SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



123 GGGAAATi^TCcCGaTQCTGGATACCGGTcaCJ^CAAACGTtXSCCGTGCGGCJ^ 

ninnnin n niiiiinniii iniiiiniiiiiiiniiiiiiniii 

123 GGGAMTACATCtCGGTGCTGGATACCGGTtTCACCMU^CGTGGCCGTGCGGC^^ 

inn I II iinniiiiiiiiin ii nnini n nil n in in 

123 tGGJUUU:gCgTCgCGGTGCTGGATACCGGTCrCgrCCAAAC6TaGCtCn^ 

inn Ml iiiiiiniiiniiinn iini II II iiinnnniiii 

123 gGGMAtaCaTCt CGGTGCTGGATACOGGTCrcaCCAAAtGTgGCcGTGCA^^ 

gGGJUUlCaCaTCta^gTGCrGGATACCMGTctCaCCAAAcGTgGCcGTGC -GC - GCC - GGC 



SEQ ID NO: Isolate

26 T2 184 GCtCTtACGCAGGGCTOSCGGACGCACATcGACATGGTTGTGATGTCCGCC^

n n iiininniiiiiiiinni niiiiniiiiiiiiiiniiiiiiiiiii 

27 T4 184 GCCCTCACGCAGGGCTTGCGGACGCACATtGACATGGTTGTCATtnrCCGC»CX5 

iiiiiiinniiiiiiiiiiiiiilin iiiiiiiiiiiiiiiiiiiiiiiiiiiini 

28 T9 184 GCCCTCAOjCWKXOTTGCGGiUXSCACATCGACATCGTTGT^ 

iiiiiniinniiiiinin iniiiiinini iiniinii inn mini 

29 USIO 184 GCCCTCACGCAGGOCTIGCGGACtCACATCGACATGGTcGTGATGTCCGCCAC^ 
26-29 consensus GCcCTcJ^CAGGGCTTOCGGACgCACATcGACATGGTCGTGATGTCCGCCArc 



SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



245 CTXSCcCTCTACGTGGGGGACCTCTGCGGCGGGGTCATGCTCGCAOCC^^ 

nil II iiiiinimiiiinniimiimiinnnnniiiiiiin ii 

245 CTGCTCTtTACGTQGGGOACCTCTGCGGCGGGGTGATGCnrGCAGCCCA^ 

I inn iimnmi iiiiiiimim niiiiii ii minimi i 

245 CCGCTCTCTACGTGGGGaAtCTCTGCGGCGGGGTaATGCTCGCcGCtCAGATGTO^ 

imiii iiminiii mnn in i iiiiini n ii inmin i 

245 CCGCTCTtTACGTGGGGGActTCroCGQtGGGaTgATGCTCGCaGCcCAaATGT^^ 
C - GC tCT - TACGTGGGGGAccTCroaSGcGGGgTgATOCrcGCaGCcCAgATGT^ 



FIGURE 1C

SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



306 CTCGCCGCgACgcCACTGGTTTGTGCAAGAaTCCJUlTTGCTCc^ 

llllllll II IIIII IIIIIII IIHI lllllllllll llllllll II IIIIII 
306 CTCGCCGCAACAtCACTGGTITCnGOUUSAcTGCMTTGCTCtATCT 

IIIIIIIII II IIIII IIIIIII II II Mill mil II llllllll IIIIII 
306 CTCGCCGCAgOlCCACTGGTTnTrGCAGaAATGCAACTGCrC^ 

IMIIIII IIIIIII IIIIIII IIIIHIIIIIIIIIIIIII Mill MIMMM 

306 CTCGCCGCgcCACCACTcGTTTGTGCAGGJUlTGCAACroCTC^^ 

CTCGCCGC - aCacaVCTg OrmiU ^-GAaTGCAA-TGCTCcATcTACCC - GGtACCATC 



SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



367 ACTGGACACCGTATGGCATGGGACATGATCATGAACTGGTCGCCCA^ 

IIIIIIIIIIIIIIIMMIIII IIMIIIMMIMIIIIIIIII MIIIIIIMIII 

367 ACTGGACACCGTATGGCATGGGAUlTGATGATGAACTtXrrCGCCa^ 

IIIIIIIIIIMIIIIIIIIMI MIMMMMIMIIIMIIIi lllllllllll 

367 ACTGGACACCGTATGGOlTGGGACATGATGATGAACTGGTCGCCa^ 

M II Mill III IIIIIII MIMMIM Mill II IIIM MM MM Mill 

367 ACcGGgCACCGTATGGGATGGGAO^TGATGATGAACTGGTCGCCCAC^^ 

ACtGGaOlCCGTATOGCATGGGAcATGATGATGAACT GG T CG CCCAC - gCCACcaTGATCC 



SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



428 TGGCCn'ACGCGATGCGCGTTCCCGAGGTCATCaTAaACATCaTcgGCGGGTC 

IMIIIIIIMMMMMMIMIIMIIM MMIIII I IMIIII MIIIIII 

428 TGGCGTACGCGATGCGCGTTCCCGAGGTCATCtTAGACATCgTtAGCGGGGCaCACT^^ 

MMMMMMMMMMMMIMMMI lilIMM I Mill II IIIIMM 

428 TGGCGTACGCQATGCGCGTTCCCGAGGTCATCATAGAaiTCATcAG^ 

IIMIIIII IMIIMIIIIIMIIIIMIIIMIIMIIII IIIM II M IIIM 

428 TGGCGTACGtGAT6CGCGTrCCCGAGGTCATCATAGACATCATtJl6CG(^^ 

TGGCGTACGcGATGCGCGTTCCCGAGGTCATCaTAGACATCaT- aOCGGgGCtCAcTGGGG 



SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



4 89 CGTCATGTTtGGCTTGGCCTACTICrCTATGGAGGGAGCGTGGGCGAAgGTCaTTC^^ 

MM I Mill t llllllll I I MM II I M II II M M II II M II I II IMIIII 1 1 

4 89 CGTCA Uiari ' CG GCTrGGCCTA LVltrXVTA TGCAGGGAGCGTGGGCGAAaGlC 

IIIIIIIIIIIM I IIIIIIIIIIIIIIMIIIIMIIIIIMIM II I! MIII III 

489 CGTC Alt»riXX XSCcTAGCCTA LnuriC rATGCAGGGAGCGTGGGaS^^ 

MM IIIMIII MIIIIIIIIIIIIIMIIIIIIIIIIIIIIIII IIIMIIII IM 

489 CGTCtTGTTCGGCtTAGCCTACnTCTCTATQCAGGGABCGTGGGCGAAaGTC^^ 
CGTCaTGTTcGGCtT-GCCTACTTCTCrATGC^GGGAGCGTCGGCG^ 



SEQ ID NO: Isolate
26 T2
27 T4
28 T9
29 US10
26-29 consensus



550 CTctTGCTGGCtGCTGGGGTGGACGCG 

II IIIIIM IIIIIIMIIMIII 
550 CTtcTGCTGGCCGCTGGGGTGGACGCG 

M MM IIIIIII IMIMIII 

550 CrgtTGCTcaCCGCTGQcGTOGACGCa 

M MM IIIIIM MIMMM 

550 CTtcTGCTagCCGCTGGgGTGGACGCG 
CTt -TGCTggCcGCTGGgCTGGACGCG 



FIGURE 1D

SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



GTGGAAGTtASaAACAcCAGTTttAQCTACXACGCCACCA&TGATTGCTCgAACi^ 

11 I I II II II III! mil lllllllllllllllllllllllltl llllllllll 
GTCXVUUmJlGGAACATCAGTTCcAGCTACTACGCC^ 

lllllllllllllllllilllll I Mil I II llllllllllllllllllllll Hill 

GTGGAAGTCAGGAACATCA&TrCTAGCTACTAtGCCACCAATGAT^ 

iiiiiiiiiiiiiiii iiiiiiiii mil iiiiiiiiiimiiimiii imi 

GTGGAAGTCAGGAAOlcCAGTTCTAGtTACTAcGCCACCAATGATTGCr^ 
GTGGAAGTcAGgAACA- CAGTTctAQcTACTAcGCCACCAATGATTGCTCaAACAaCAGCA 



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



£2 TCACCTGGCAgCrcACCaACGCACTTCTCCACCTTCCCG^ 

llllllllll II II II llllllllll mmiiii mini mmi III mm 

£2 TCACCTGGCAACTCACCgACGCAGrrcrcaiCCTO 

mmiiiiimiii iiimi mimimmmmm iiimiim 

£2 TCACCTGGCAACTCACCJJiCGC^GTcCTCCACCTTCCCGGATGCG^ 

mimiiimiimiiiim mmmiiiiimmiii imiiiiiii 

£2 TCaCCTGGaUUrTCACaUCGCACTtCTCCACCTTCCCGGATGOT 

TCACCTGGCAaCTCACCaACGCAGTtCTCCACCTTCCCGGATGCGTCCCaTGTG;^^ 



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



123 OUlTGGCACCtTGCGCTGCTGGATACJUlSTaJUJUi^^ 

iiiiiimi mimiimmim imimiiiiimimimii iii 

123 CAATGGCACCCTGCGCTQCTGOATJUJUUTIGACACCTi^^ 

immiiim immmmimmmmiimmmimmii 

123 tAATGGCACCCTGCACTGCTTXaATACAAGTCUCACCTJ^ 

iiiiiiiiiiiiiiiiiiiiiiiiiiiiimiimiiiiiiiiiimiimimi 

123 cAATGGCACCCTGCACTGCTGGATACAAGTtaJU^ 

CAATGGCACCCTGC - CTGCTXX]ATACAAGTgACACCTJUmnx;GCTGT^^ 



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



184 GCACtcACTCAcAACCTGCGAACgCAtGTCGACCntjATCCTrAATGGC^ 

mil mil mi I mm ii mmiiimmiiimirmimiiii 

184 GO^tACTCAtiUlCCTGCGAACACACCnCGACGTGATCGTAATGGaUSCT^ 

II II mil iimiiii iiiiiiim iiiimiiimmmmiiiii 

184 GCgCTCACTCAOUlCCTGCGMKrACACGTCGATATGATCGT^ 

II iiimiiiiiiiiiiiiimt I iimiii mmiimimtiimi 

184 GCaCTCACTCACAACCTGCGASCACAtaTaGATATGATtGTAATGGCAGCTAOGGTCTG 
GCaCTcACTCAcAACCTGCGA-CaCA-gTcGA- - TGATcGTAATGGCAGCTACGGTCTGCT 



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



245 CGGCCTTGTATGTGGGg^yiCGTgTGCGGGGCCGTGATGATaGcCrrCGC^ 

imiiiiimim mil iiiiiiiiiiiiiiiii i iiiiiiiiii iiiim 

245 CGGCCTTGTATtnXXMAGACGTaTGCGGGGCCCTGATGAT C GTg^ 

mmmimmmi i iniiiiiimiin iiiiiiii iiini iiiim 

245 CGGCCTTGTATGTGGGAGACaTGTGCGGGGCCGTGATGATCGTG^ 

iimmimmiim iimiiiimmmiimiiiimiimm i 

245 CGGCCTTGTATGTGGGAGACgTGTGCGGGGCCCTGATGATCGTGTCGCAGGC^^ 
CGGCCTTGTATGTGGGaGACgTgTGCGGGGCCGTGATGATcGtGTOSC^^ 






FIGURE 1D

SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



306 ATCGCCaGAACGCCACAACTTcACCCAGGAGTGCAACTCriTCaiTCTACCAJU^^ 

Mini iiiiiiitiiiiii iiiiiiiniiiiiiiiiiiiiniuiiiiiiiiiiii 

306 ATCGCCtGAACGCCACAACTTTACCCAGGAGTGCAACTGTTCaiTCTACCJU^^ 

iiiiii iiiiiiiiiiiiiiiiiiii 1 1 1 II II I II II I n I II I II 1 1 1 1 1 II nil 

306 ATCGCCAGAACGCCACAA LUTlA CCCAAGAGTGCAA CiW ' ItZ CATCTACCAaGGTCgTATC 

iiiitiiiiii nil nniniiiiiiiiiiii iiiii iiiiininiiiii nt 

306 ATCGCCAGAACaCCACcJlCTITACCCAAGACTrGCAACTGTTCCATCTACCJ^^ 

ATCGCCaGAACgCCACaACTTtACCCA- GAGTGCAACTtrTICavrCTACCAAGGTCatATC 



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



367 ACCGGCCACCGCATGGCATGGGACATOATGCTgAACTGGTCJtfrCAACI^ 

iiinnniniiiiiiniiiiinnni iiiiniiniiinii niinnii 

367 ACCGGCCACCGCATGG»TGGGA»TGATGCrAAACTGGTCACCAAC^ 

iiiiinnniiiiii inniiniiiniinniinniimniiiiinni 

367 ACCGGCCACCGGATGGCgTGGGACATGATGCTAAACTGGTCACCJU^CTC^ 

iinititiiinini nnnnnnn iiiiiiiniiiiini niiinin 

367 ACCGGCCACCGCATGGCaTGGGACATGATGCTtAACTGGTCACCAACTCrcACCATGATCC 
ACCGGCCACCGCATGGCaTGGGACATGATGCTaAACT GG TCACCAACTCT - ACCATGATCC 



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



428 TCGCCTAcGCtGCTCGT G TgCCTGAaCTAGtCCTtgAa U ' XnViVritXi GCGGCCATTGGGQ

innn ii nnnn inn nii in i iiiiiinii ininiiiiin

428 TCGCCTATGCCGCTCGTGTTCCItaAGCTAGcCCTccAgGTTGTCTTCGGCGGCCAT^

I iiiiiiiiniiiiuiiiiiiiiiiii III I iiiiiiiiiiiiiiiiiiiiiii

428 TtGCCTATGCCGCTCGTGTTCCTGAGCTAGTCCTTGAAGTTGTCTTCGGCGGCCATT^

I n inn nil niiiii iiiiiiiniiniiiiii nnnn ii nnnn

428 TcGCCTATGCCGCc Cb n m 'X tJC T t^ CTAGTCCTTGAAGTeGT LU ' ItA^

TcGCCTAtGCcGCt CGTGTt CCTOAgCTAG tCCTt gAaGTtGTCTIt:GGc(Mc»



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



4 89 CGTGGTGTTTGGCTnXSCCTATTTCTCCATGCAaGGAGCTTGGGCCAAA^^ 

niiiininiiiiiiiiiiiiiininin niniiniiitiiiini iinii 

489 arrOGTGTTTGGCTTCGCCrrATrTCTCaiTGCAgGGAGC^^ 

iiii iiiiin iiiiiiiiiiiiiiiiiiini nnnnnnn itniMiiiii 

489 CGTGGTGXTTGGCrTGGCCTATTXCTCCATGCAaGGAGCGTGGC^ 

ni nnin iniinniiiiiiiiiini iiininiiiiintiiniiiiin 

489 tGTGGTGTTTGGCTTGGCCTATTTCTCCATGCAgGGAGCGTGGGCCAAGGTCAT^^ 

c G ' mO ' m ' rX ' iXy GCnTGGC C l' A 'rrrCi'CCATGCA- GGAGCCTGGGCCAA - GTCATtCCCATC 



SEQ ID NO: Isolate
33 T8
30 DK8
32 SW3
31 DK11
30-33 consensus



550 CTCCTcCTTGTCGCAGGAGTGGAcGCA 

inn III nil II nnnn in 

550 CTCCTtCTTGTCGCAGGAGTGGATOCA 

iiiii iiiiiii iniiiiiniiii 

550 CTCCTgCTTGTCGCAGGAGTGGATGCA 

inn inn iiiniiiiiiiiii 

550 CTCCTtCTTGTaGCAGGAGTGGATGCA 
CTCCTtCTTGTcGCAGGAGTGGAtGCA 



FIGURE 1E

SEQ ID NO: Isolate
35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



tTASAGTGGCGGJUlTGTGTCcGGCCTCTAcGTCCrn^CAACGAC^ 

llllllllinilllllll llllllll llllllllllllllllll lllllllllltl 
CTAGAGTGGCGGAAlXm ? TCTGGCCTCT AlXgax;L ' lU ' A CCAAOGACT^ 

iiMiMiiiiiiii Miiiiiiiiiiiiiiii iimii niii i iiniiiiiii 

CTJUaAGTGGCGGAATJ^CGTCTGGCCICT A TtnCCTcACaUl^ ^ 

IIIIIIIIIIIMIIIMIIIIIIIIIIil INI IIIIIIIIIIIIMMIIIIIIIII 

CTAGAGTGGCGGAATACGTCTGGCCTCTATaTCCTTACCAJCGACT G 

iiiiiiiiiiMiiiimmiiiiitii I llllllll 111 imm II II iiiiiii 

CTAGJUntX^CGGAATACGTCTGGCCTCrATgTCCTTACC^ 
C^AGAGTCX;CGGJUlTacGTCtGGCCTCTAC9TCCTtACCAACGACTC3TtC^^ 



35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



62 TcGTGTATCyiGGCCGATGACGTCATTCTGCACACACCITCC^ 

I Mill II I I II I I I I II II I II I I II I II I I I I t I t I llllllll lllllllllll I I I 
62 TrGTGTATCAGGCCCa^TGACgrCATrCTGCACACACCTCX^^ 

III! II I II II MM III III I IIMMIIIIIIIIIIIIIIIIIIIMIMIIIIIIII 

62 TltriGTATGAGGCCGATCSlCGTtATTCXXSCACACACCIX^^ 

llllllllllllllllllllll IIIMIIIIIIIII IIIIIIHIHIIIIIIIIIMI 

62 TTGTGrATGAGGCCGATGACgrC A TTCT G CACACACCCGGCrGTGT 

MIIIMMIMMIMMIMMIIIIMMIIIIMMIMIHMIMMMMMII 

62 TTGTGTATGAGGCCGATCACGTCATTCTGCACACACCCGGCTGTGTACCT^^ 
TtC7TGTATGJU;GCCGATGACGTcATTCT6CACACACCtOGCTOT^ 



35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



123 OWCAATACATCtACCrroCTGGACCTCaGTGACgCCTACAGI^^ 

IIMMMMM IIIIIIIIIIMII Mill 1 1 M II I II 1 1 II I II II 1 1 M 1 1 1 II 

123 CGGCAATAOlTCCACGTGCTGGACCTCgGTGACACCTACAGTGGCAGT^ 

Ml MIMIIIMIIIIIIIIIII I IIIIIIMMMIIIIIIMIIIIII II I III 

123 CGGtAATACATCCAC G T GC T G GACCCCACrrGACACCTAOUn^ 

Ml MMIMMMMMMIMMIMMMIIMM IIIIIIIIIMIM IIIMI 

123 CGGCAATACATCCA C GT GC T G GACCCCAGTGACACCTACGGTGGCAGTCA^ 

. MIIIIMIIIIII IMIMIMIMIIIIIIIIIIMIIIIIIIMIIIIIilllllll 

123 CGGCAATACATCaitGTGCrGGACCCCftCn^ACaCCTACGGTCGCAGT^ 

CGGcAATACATCcAcGTGCTGGACCcCaGTGACaCCTACaGTtKiCAGTC^^ 



SEQ ID NO: Isolate
35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



184 GCAACCACCGCtTCQATACGCABTCATGTGGACCTGcrrACnGGGCGCGGCCACGA 

IMMIIIIM IIIMIMIIMIIMIIIMIII 1 1 M II II II I M II M II II I M 

184 GCAACCACCGCCTCGATACGCAGTCATGTGGACCTOTTASTGGGCGCGGCC^ 

IMIIIIMM MMIIMMIMIIIIIIMII M tlllllMIMMI IIMIII 

184 GaU^CACCGCTTCGATACGCAGTtyiTtmMa^CTATTg^^ 

MIIIMMMMMMIIIMMIMMMMIMM IIMIIIIIIIMI MUM 

184 GCAACCACCGCTTa»TACCCWn^TGTGGftCCTATTAGTGGGCGCGGCCACGC^ 

IIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIilllll 

184 GCAACCACCGCTTCGATACGCAGTCATGTGGACCrATTAGTGGGCGCGGCCACGCTG^ 
GCAACCACCGCtTCGATACGCASTCATGTGGACCTatTaGTGGGCGC^ 






FIGURE 1E

35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



245 CTGCGCTCTACGTGGGtGATgTGTGTGGGGCCGTCTTCCTtGTGGGAO^ 

IIIIIIIIIIIMIII III llllll lllllllllim llllllllllllllllllll 
245 CTGCGCTCTACGTGGGcGATATGTGTtXXXSCCGTCTTCCTCGTGGGAC;^^ 

iiiiiiiiiiMiiii llllllllllllll llllll iii iiiiiiiiiiiiiniiiii 

245 CTGCGCTCTAOSTGGGTGATATGTGTOGGGCCGPrCTTTCTC^^ 

iiiiMiiii iiiiiiiiiiiiiiiiiuii iiiiiiii iiiiniiiiiiiiiiiiiii 

245 croCGCTCTATGTGGGTGATATGTOTGGGGCCGTCTTTCTant^^ 

iiiiiiiiiiiiiiMiiiiiiiiiiiiiii iiiiin iiiiniiiiiniiiiiiiiii 

245 CTGCGCTCTOTCmXXnxaiTATGTGTGGGGCCGTCT^^ 

CTGCGCTCTAcGTOGGtGATaTGTGTGGGGCCGTCTTtCTcGT^^ 



SEQ ID NO: Isolate
35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



306 OtflACC t CGTCGCCATCAAACaGTCCAGACCTGTAACrocrCGCTCT^ 

llllll IIIIIMIIIIIII lllllllllllllllllllllllllllilllllll III 

306 CAGACCgCGTCGCCATaUU^GGTCCAGACCTGTAACTGCTCGCTG^ 

llllll llllllllllllillllllllllllMltllllllillltllllllllll III 

306 CAGACCrCGT C GCCATCAAACGGTCCAGACCTXjrAACTGCTCGCTGTACCCAGGCCOT 

iitiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiitiiiiiiiiiiitiiiiiii 

306 CASftCCTCCTCGCCATCAAACGGTCaUjACCTGTAACTGCTCGCTGTACCawBGCCATCn 

lllllllllilllllllllllllilillllllllllllllllllllllllllllllll II 
306 CAGACCTCGTCGCCATCAAACGGTCCAGACCTCnAACTOCICTCTCT 

CAGACCtCGTCGCCATCAAACgGTCCAGACCTCnAACTGCTCGCrtnACCC^ 



SEQ ID NO: Isolate
35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



367 TCAGGACATCGAATtXjCTTGGGATATGATGATCAATTGGTCCCCCGCtGT^^ 

IIIIIMIIIIIIIIIIIIIIIIIIIIIIIIilllllllllllMII lllllllllllll 

367 TCAGGACATCGAATGGCTTGGGATATCATGATGAAV^ 

1 1 II I III II I III! II I Mil II II I III! II II II II I III III lllllllllllll 

367 TCMSGACATCXjCIITGGCTTGGGATATGATGATGJUTTGGTCCCCCGCTT^^ 

llllllllll Illlllllllllllllllllllll Ililllllllllll 

367 TCAGGACATCGAATGGCrrGGGATATGATGATGAATTGGTCCCCCGCTGTGGGTAm 

llllllllllllllllllll 1 1 II I II I '!'''!''':' 'I'!'! 'UIMiU 

367 TCWKSACATCGAATGGCTTGGGATATGATGATGAATTGGTCrc^ 

TCAGGACATCGaATGGCTTGGGATATGATGATCAATTGkaTCCCCCGCtC^^ 



SEQ ID NO: Isolate
35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



428 TaGCGCACGTCCTGCOtcTGCCCCAGACCTTGTTCGACATAATAGCtGGGGCCCATT^^ 

I IIIIIMIIIIIII IIIIIHIMIIIjllllllllllllll Ililllllllllll 
428 TXKSroCACGTCtnXSCTgTlXKXCCAGACCTIt^^ 

llllllllll Hill llilllllllll IMIlllllllltllllllllllllllllll 
428 TGGCGO^GTtCTQCGtTTGCCCCPJSACC^^^ 

lillMll I Mill llllinilll l INI m ill I lllllllllllllllll 
428 TGGCGO^TCCTGCGATTGCCCaiCSACCrrGTT^ 

MMIIIIIIMIIIIIIIIIIIIIIII MIMIII IIIIIIIII^ 

428 TGOCGCACATCCTGCGATTGCCCCAGACCTlWrnV«ACATACTGGCCGGGK 

TgOCGCACgTcCTGCG- tTGCCCCAGJ^CtTGTTcGACATAaTaOCcOOOOCCCATr^^ 



FIGURE 1E

SEQ ID NO: Isolate
35 DK12
36 HK10
37 S2
39 S54
38 S52
35-39 consensus



489 CATCaTGGCgGGCCTAGCCTATTACTCCATGCAGGGCAACrGGGCC:^^ 

II I! n il llllllllllltllllllllllllllllllllllllllllllllllllll 

489 CATCITGGCaGGCCTAGCCTATTACTCCATGCACGGCAACTGGGCCAAGGTCGC^^ 

lllllllll Itlllllltllltlllllltllllllllllllllltlllllllllllllll 

489 CATCTTGGCGGGCCTAGCCTATTACTCCATGCAaGGCAACrGGWCCAAGG^ 

ll lllll lllllllllllllllll II Itlll llllltlllllllllllllllllllll 

489 CATCTTtXKXXXKrCTAGCCTATTATTCTATGaUMGaU^^ 

llllltllllllllllllllllltlllllllllllllllllllltlllltlllllll II 
4 89 CATCTTGGCGGGCCTAGCCTATTATrCTATCCAGGGCAACTGGGCaUU^^ 

CATCtTTCCgGGCCTAGCCTATTAcOTcATGCAgGGCAACTGGGCau^^ 





SEQ ID NO: Isolate

35 DK12 550 ATGGTTATGTTTTCAGGaGTCGATGCC

iiiiiiiiiiiiiiiii lllllllll

36 HK10 550 ATGG'l'iA'ilj ri'l'iXJAGGGGTCGATGCC

iiiiiiiiiiiiiiiiiiiiiii III

37 S2 550 ATGGTTATGTTTTCAGGGGTCQAcGCC

III iiiiiiiiiiiiiiiiiii III

39 S54 550 ATGATTATGTTTTCAGGGGTCGATGCC

llltlllllllllllllllllllllll

38 S52 550 ATGATTAlXJnTTCAGGQGTCGATGCC

35-39 consensus ATQgTTATGTTTTCAGGgGTCGAtGCC



FIGURE 1F

SEQ ID NO: Isolate
43 Z7
42 Z6
42-43 consensus (Z6)



1 GTcAACTATCaCAATCCCTCGGGCGTCTATCACATCACCAACGA 

II lllllll lllillllllllllllllllll lllllllllllllllllllllllllll 
1 GTtAACTATCGCAATGCCTCGGGCGTCTATCACGTCACaUCaACTGCCC^ 

GTTAACTATCgCAATGCCTCGGGCGTCTATCACgTCACCAACGACTC 



43 Z7
42 Z6
42-43 consensus (Z6)



62 TAaTGTATGAGGCCGAACACCAOlTCCTACACCTCCCAGGGTGCGTACCCntS^^ 

II iiiiiiiiiiiiiiiiiii III iiiiiiiiMiiiiiii I iimmiiiM 

62 TAGTGTATGAGGCCGAACACCAgATCTTACACCTCCCACXXriGCt^^ 

TAgTGTATGAGGCCGAACACCAgATCtTACACCTCCCAGCXnTSCtTgCCCT ^ 



SEQ ID NO: Isolate
43 Z7
42 Z6
42-43 consensus (Z6)

123 gGGGAACCAGTCACGCTGCTGGGTn^GCCCTTACrCCCACCGTGGCG^

inn inniiiiiiiiiiiiiiiiiiiininniiiiiiii i niiiiniii

123 tGGGAAtCAGTCACGCrGCTGGGTGGCCCrrACTCCCACCGTGGCGGtatCTTA

tGGGAAtCAGTCACGCTGCTGGGTGGCCCTTACTCCCACCGTGGCGGtGtC^



SEQ ID NO: Isolate
43 Z7 

42 Z6 

42-43 consensus (Z6) 



184 GCaCCGCTTGAaTCCaTCCGGAGACSlTCrrGGACCTGATGQ^ 

II inniii III innniiiiinniiinnn inn nin ii nil 

184 GCTCCGCTTGAcTCCcTCCGGAtSACATGTGGACCTGATGGTGGGCGCTO 

GCtCCGCTTGAcTCCcTCCGGASACATGTGGACCTGATGGTgGGCGCcGCTA^ 



SEP ID NO; Isolate 
43 Z7

42 Z6 

42-43 consensus (Z6) 



245 CcGCtCTCTACaTTGGGGACCTGTGCGGTGGcOtATTtTTGGTTGGtCAGATCTTtTCTTr 

I II I Hill Mil II I nil I II 11 1 I III lllllll I iiiiiiii II II 

245 CtGCCCTCTACgTTGGAGAtCTGTGCGGTGGTGcATTCTTGGTTGGcCAGATGTTCTCCT^ 


CtGCCL"lXrrACg'lUXjGaGAtC"lXjlXjCGGTGGtGcAl'iyriXa>GriliGcCAC^ 



SEQ ID NO: Isolate

43 Z7
42 Z6
42-43 consensus (Z6)



306 CCAGCCGCGACGCCACTGOACTACGCAGGACTGCAArZXrrTCCATCT^ 

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii niii i II III n tun i 

306 CCAGCCGCGACGCCACTGGACTACGCAGGACTGCAATTGTTCU^^ 
CCAGCCGCSACGCCACrGGACTACGCAGGACTGCAATTGTTCt^^ 



43 Z7 
42 Z6 
42-43 consensus (Z6) 



367 ACaGGCCAOUSaATGGCATGGGACATGATGATGAACTGGACrrcCa^C^ 

n II linn iiiiiiiiiinniiiiiiniiiiiiiiiiiiiiiiini ii i i 

367 ACgGGCCACAGgATXKKJ^TGGGACATGATGATGAACTGGAGTCCCACA^ 

ACgGGCCACAGgATGGCATGGGACATGATGATQAACTGGAGTCCCACAACC^ 



FIGURE 1F

SEQ ID NO: Isolate
43 Z7
42 Z6
42-43 consensus (Z6)



428 TCGCCCAGGTtATGAGGATCCCrABCACTCTGGTgGACCTACTCaCTGGA^ 

llllllllil lllllllllllllllllllllll II mill IIIIIIIIIIIIMII 
428 TCGCCCAGGTcATGW»ATCCCTAGaurrCTtK?ra^^

TOGCCCAGGTcATGAGGATCCCTAG»CTCTGGTaGAtCTACTCgCT^^ 



SEQ ID NO: Isolate
43 Z7
42 Z6
42-43 consensus (Z6)



489 taTCCTTaTcGGGgTGGCaTACrrctGCATGCAAGCTAATTGGGCauUXrrC^^ 

mil I III nil iiiiii I iiiiiiiiiiiiiiiiiiii Mill iimi 

489 CgTCCTlCTKXWtTGGCGTJUnrC^GtA^ 

cgTCCTTgTtGGGtTGGCgTACTTCaGtATGCAAGCTJUTTGGGCau^C 



SEQ ID NO: Isolate
43 Z7
42 Z6
42-43 consensus (Z6)



550 CTTTTCCTCTaCGCTGGAGTTGATGCC 

imiiiiii iiiiiiiiiiiiiiii 

550 CTTTTCCTCTTCGCTGGAGTTGATGCC 
CTITTCCTCTtC(jCPGGAGTTGATGCC 



FIGURE 1G

45 SA1
47 SA5
49 SA7
46 SA4
50 SA13
48 SA6
45-50 consensus



GTtCCCTACCGgAATGCCTCTGGGGTTTAc»rerCACCAATGAcTGCCCJUA 

II llllllll lllllllllllllllll llllllllllllll I II I I III MM I II 

GTCCCCTACCGAAATTCCTCnXKSGGTTTATCATgrCACC^^ 

1 1 III II II II III nil II III llllllllllllll nil Nil Mil 1 1 Mil II 1 1 

GTCCCCTACCCUUWlTQCCrCcGGGGTTTATCATGTCACCAATGAT^ 

II lllllllllll Mill IIIMIIIIIIIIItllllllilllllll IMIIIIIII 

GTTCCCTACCGAAAcGCCTCrroGGGTTTATCJlTGTCACCAATGATTC 

IIIIIIIIIIIIM IMMIIMIIMIIMMMMMIMMMMIMIMMMII 

GTTCCCTACCGAAATGCCTCTKXXXnTTATCATCrrCACCAATGAT^ 

nil! Mill Ml Mill III I II llllllll lllllllllllllllll llllllll 

GTrCCtTACCGgAATGCCTCTSGGGTgTATCATtrrtACCAATGATIXSCCCJ^ 
GTtCCc-TACCGaAAtGCCTCtGGGGTtTAtCATGTcACCAATGAtTGCCCaAACTCtTCCA 



SEQ ID NO: Isolate
45 SA1
47 SA5
49 SA7
46 SA4
50 SA13
48 SA6
45-50 consensus



62 TAGTCTACGAGGCTGATAgCCPGATctTGCACGCACCTGGcOT 

IIIIIIIMIIMIIIII llllll IIIIIIIIIIIM II II III I Ml III I I I I 
62 TAGTCTACGAGGCTGATAACCTGATtCTGCACGCAC C TUa '1' i ' bCG ' lt. CCCTSTGT C AaGgA 

IIIIMI llllllll llllllll IIIIIIIMMIIIIIMIIIIIMIMIII I 

62 TAGTCTAtGAGGCTQAcAACCTOATCCTGCACGCACCTGariGCGTGCCCTO 

MM M llllllll MMMIMMM Mill Mill Ml MM llllllll II 

62 TAGTtTACGAGGCTGATAACCTOATCTTGCAtGCACCTOGTIGCGTGCCcTGTGT^ 

I II IIIMIIIIIII IIMIIMII M MMIMIMIIIMI I Mill Mill 

62 TaGTCTACGAGGCTGATGACCTGATCcrrACACGCACCrrGGcrro 

TaGTcTAcGAGGCTGAtaaCCTGATc - TgCAcGCACCTGG tTGCGTGCCcTGTGTcaggcA 



SEQ ID NO: Isolate

45 SAl 123 AGaTAATGTCAGTAGGTGCrGGGTCCAAATCACCCCCy^CAtfr^ 

II IIIIIMIIItlltlllllllllllllllllllllll MIIIIMMII I Mill 

47 SA5 123 AGgTAATGTCACn'AGGTGCTGGGTCCAAATCACCCCCACATrGTCAGC^

n I IIIIIMIIIIMIIIIIIIIIIIIIMM\IMIIIIIIIIIIIIIIIIIIIIMIM 

49 SA7 123 AaATAATGTCAGTAGGTGCTGGGTCCAAATCACCCCCACATrGTCAGCCC 

i II II llllllll Mill MM III II llllllll I.I IIIIMI I MM 1 1 MUM 

46 SA4 123 AGATAATGTCJtfrrAaGT G CT G GGTCCAAATCACCCCCACgTIt^^

I MIIIIIIIM IIIIIIIIIIM IMMIMIII IIII IMIIIII llllll 

50 SA13 123 GGgTAATGTCAGTAGGTGCTGGGTCCAgATCACCCCCACACrGTCAGCC^

II IIIIIIMIIII IMIIIII II IIIIIIMIIIIII IIIIIIIIIIMIIIMI 

48 SA6 123 GGaTAATGTOlGTAGaTQCTGGGTtCAtATCACCCCCACACTaT^

45-50 consensus agaTAATGTOlGTAggTGCTGGGTcCAaATCACCCCCACa-TgTXJ^ 



FIGURE 1G

45 SA1 184
47 SA5 184
49 SA7 184
46 SA4 184
50 SA13 184
48 SA6 184
45-50 consensus





iiniiiiiiiii iiiiim iiii iiiiiiiMiMiiiiiiiii mil iiiiiii 

GCGGTCACGQCTCCl V nxX A GJUXyStCGTTGACTACTrA^ 

IllllHllllll llillll lllli llllllllll llllllllllllllllllllllll 
GCGGTCJUrGGCTCCTCTTCGOAGGGCCOTroACTACcT^ 

II II I Hill I II II II II IIIIIII II II mil I miimiiiimiimim 

GCGGTCACGGCTCCTCTrCGGAGGGCCGTTGACTACTTAGCGGGAGGGGCTGCCCTCTGCT 

imiiiiiiiiiiimimiiimiiiimiiiiiiiii mmimi mi 

GCGGTCACGGCTCCTCTTCGGJU«X5CCCnTGACTACrrAGCGGGg^^ 

iiimiiiiiiiiiimmiimiim imi mii imi iim mi 

GCGGTCACGGCrcCTCTrCGGAGGGCCGTTGAtTACTrgGCGGGaGGGTC 
GCGGTCACXXSCTCCTCrrCGGAGGGcCGTTGACTACtTaGCGGG 



SEQ ID NO: Isolate

45 SA1 245
47 SA5 245
49 SA7 245
46 SA4 245
50 SA13 245
48 SA6 245
45-50 consensus





CCGCACTATACGTCGGcGJlCGCGTGCGGGGCAGTOTTtCTGGTAGGCCAAAT^^ 

iiiiiiiiiimm iiiiiiimmimm iiiiiiiiimiiiimm 



nil iiiiiiiiiiiiiiiiiiiiiiiiii niiii iii iiiiiiii IIIIIII III 

'AGGCCAgATCrrrCAgCTA 

imi iimii III 

*AGGCCAAATGTTCACCTA 

II mi l mi l I III 

'AGGtCAAATOrrCACCTA 

lilt iiiiiimiiiii 

'AGGcCAAATGTTCACCTA 



nil niimiiimmniiii 

CCGCaCTATACGTCGGGGACGCGTGCGGGGi 

im minim minim 

CCGCGTTATACGTCGGAGACGCGTGCGGGGi 

mniinimiiiimi imiiiiii 

CCGCGTTATACGTCGGAGACGtGTGCGGGGCAt 




CCGC - cTATACGTCGGgGACGcGTGCGGGGCAgTGTrt tTGGTAGGcCAaATGTTCAcCTA 



45 SA1 306
47 SA5 306
49 SA7 306
46 SA4 306
50 SA13 306
48 SA6 306
45-50 consensus




SEQ ID NO: Isolate

45 SA1 367
47 SA5 367
49 SA7 367
46 SA4 367
50 SA13 367
48 SA6 367
45-50 consensus





TAGGCCrCGCCAGCATACcACaGTGCAGGACTGCAACTGTTCCArrr^^ 

m mm nil mil n iniiiinniiiiimiiiiiiiiii iiiiiiiii 

TAGGCCTCGCCAGCATACTACGGTGCAGGACTGCARCTGTTCCAT^ 

imiiiiimiii iiiiiiiniiiiiiiiniiniiiiiiiimi iiinnii 

TAGGCCTCGCCAGOVCACTACGGTGCAGGACTGCAACTGTTCCATCT 

nnmnnnmnmimii ninm n ii mmmimiiin 

TAGGCCTCGCCAGCACACTACGGTGCAaGACTGCAAtTGcTCtATTTAC^^ 

III I III III III I I mil mimi n n iiiiniiiiiiii in 

TAGcCCTCGCCgGCATI^TgttGTGCAGGACTGCAACTGtTCCATTTACAGT^ 

III mini nil i n iiiiiiimiiii iiiiiiiiiimiin in 

TAGgCCTCGCCaGCATgcTacgCTaCAGGACTGCAACTGcTCCATTTACAGlXK; 
TAGgCCTCGCCaGCAtactacgGTgCAgGACTGCAAcTGtTCcATTTACAGtGGC« 

ACCGGCCACCGgATGGCtTGGGACATGATGATGAATTGGTCACCTACGAC^ 

iiiinnin inn iiiiiiniiiiniiiiiiiiiniiiiniiiiiiii iii 

ACCGGCCACa^AATGGCAlX^GGACATGATGATGAATTGGTCACCTAC^ 

iiininniiiniiniinniiiiiininimiiiiniiiiiininiiii 

ACCGGCCACCGAATGGCATGGGACATQATGATGAATTGGTCACCTACGACAGCCrT^^ 

niniimi nmminiiniiiiiniiiiiiiiiinim mm in 

ACCGGCCACCGGATGGCATGGGACATQATQATGAATTGGTCACCTACGACgGCC^ 

inniniiiiimiiiiiiiininiiimnimiinii n ii in in 

ACCGGCCACCGGATGGCATGGGACATGATGATGAATTGGTCACCTACaACAGCtTTG^ 

n nininiininmiinmiiiiniinmm i inn iiiiin 

ACtGGCCACCGOATGGCATGGGACATGATGATGAATTGGTCACCcgCgACAGCcTT^ 
ACcGGCCACCGgATGGCaTGGGACATOATGATGAATTGGTCACCtaCgACaGCcTrc 
FIGURE 1G

SEQ ID NO: Isolate

45 SAl 428 TGGCCCAGaTGCTAOSGi^TcCCCCAgGTGGTaiTaGACATaiTaGCCGGGGGCCA^^ 

llllllll III nil III mil llllllll llllllll Mill I mil II II II 

47 SA5 428 TGGCCCAGgTGCTACGGATTCCCCAaGTGCrrCATtGACATCATTGCCGGGGGCCAC^^ 

imiiii iiiiiimiiiiiii llllllll iiiiiimmiiiimiiimi 

49 SA7 428 TtXSCCCAGTrGCTACGGATTCCCCAGGTGGTCATCGACATCAT^ 

iMiiiiiiiiMiiiiiiiiiiiiiiiiiiiiimmimimmmimm 

4 6 SA4 428 TGGCCCA UriXiCTA CGGATrCCCCAGGTGGTCATCGACATCATTGCCGGGGG 

iiiiiiiiiii iimiiiiiiiiimiiiii iiiiiiiiimiiii iiimm 

50 SA13 428 TGGCCCAfnTQtTACGGATTCCCaiGGTGCntJlTTtSACAT^ 

nil III II II IIIIIIIIIII III limn III II III nil nil iiniiiii 

48 SA€ 428 TGGCCCJ^TGcTACGaATTCCCCAGGTGGTCATIGACATCATTGCCGGGG 
45-50 consensus TGGCCOlgtTGcrrACGGATtCCCCAgGTGGTCATtGACATCATtGCCGGGGgCa^^ 



SEQ ID NO: Isolate
45 SA1 489
47 SA5 489
49 SA7 489
46 SA4 489
50 SA13 489
48 SA6 489
45-50 consensus



GGT LTiUri ^GCaScCGCATACTTtGCGTCgGCcGCcAACTGGGCT 

iii iiiiii nil iiiiiiiii mil II II iiiiiiiiiii m in n in 

GGTCTTtnTCGCCGCCGCATACTTCGCCn'CASCGGCTAACTGGGCTJUlGGTTGTGCTGCjrC 

ni iiiiim iii IIIIIIIIIIIIIIIIIII Mill iiiiiiiiijimiMiiii 

GGTCTTSTTCGCCGCCGCATATrrCGCGTCAGCGGCTAACTQGGCIAAGGTTG^ 

mm I II iiiiiiiiiiimiiiiiiiiiiiiiiiiiiiiiiiiiiii i mm 

489 GGTCTTGTTtGCCGCCGCATATrTCGCGTCAGCGGCTAACT^^ 

inn I III I ftTtff*????]?iiT?? 



489 GGTCTTGTTCGCCGCtGCATACTtCGCGTCGGCGGCTAACTGGGCt^^ 

GGTCTTGTTcGCCGccGCATACTtcGCGTC - GCgGCtAACTGGGCtAAGGTt gTgCTGGTc 



SEQ ID NO: Isolate
45 SA1
47 SA5
49 SA7
46 SA4
50 SA13
48 SA6
45-50 consensus



550 CTGTTcCTGnTGCGGGGGTCGATGGC 

iini iiiiiiuiiiiii 

550 CTGnTCTGTITGCGGGGGTCGATGGC 

llllllllllll llllllllllll u 

550 TTGTrrCTGTTTGCGGGGGTCGATGCC



inn 



550 
550 



iiiininiiiiiiiin 



550 TrGTTTCTGTTTGCGGGGGTCGATQCC 

IIIIIIIIIII 

iCGGGGGTCGATGCC 

iiiim mill 

^GGGGGTtGATGCC 
TGTTtCTGnrrGCGGGGGTcGATGcC
FIGURE 1H



SEQ ID NO: Genotype

30-33 (IV/2b) 1 GTGGAAGTcAGgAACAtCAGTTctAGcTACrAcGCCACaUTGAT^^

34 (2c) 1 gtggaggtcaaggacaccggcgactcxtacatgccgaccjuu:ga 

26-29 (III/2a) 1 GcccAAGTGAagAAO^CAgtacCaGcTAcATtXrrGACcAAcG^

35-39 (V/3a) 1 CTTiUSAOTGGCGGMTacGTCtGGCCTCTAtgTCCTtACCAACG^ 

9-25 <IZ/Xb) 1 tAtGAaGT6CgCAACGTgTCCGGG9tgTACCAtGTCAC9AAcGACIG<rrcaulCTca^ 

1-8 (I/1a) 1 tJlCCAAGTgCGCAACTCcaCgGGgCTtTACCATGTcACCAATGAtTGCCCrJ^^

40 (4a) 1 GAGCACTACCGCSAATGCTrCGGGCATCTATCACATCACCAAT^ 
42-43 (4C) 1 GTtAACTATCgaUVrGCCTCGGGCGTCTATOlCgTCACCAACGAC^ 

44 (4d) 1 TACAACTATCGCAACAGCTCGGGTGTCTACCATGTCACCAACG^ 

41 (4b) 1 GTGCACrACCGGAATGCTTCGGGOnrTATCATtrrau: 

45-50 (5a) 1 GTtCCCTACCGaAAtGCCTCtGGGGTtTAtCATGTcACCAATGAtTGCCCaAAC^ 

51 (6a) 1 CTTACCTACGGCAACTCCAGTGGGCrATACCATCTCAauUlTG^ 

1-51 consensus A TA ACAAGATGCAA 

SEQ ID NO: Genotype

30-33 (IV/2b) 62 TCACCTGGCAaCTOlCCaACGCAGTtCTCCACCITCCCGGATGCGTCCCaTGTGA 

34 (2c) 62 T CUrri^ GCAOCTTQAAGGftGCA UlTJCriV ATACTCCTGGAT^^ 

26-29 (IIZ/2a) 62 TCACcTGGCAaCTccAgGCcGCGGTcCTCCACGTcCCCGGGTGtgTCCCG^ 

35-39 (V/3a) 62 TtCntrrATGAGGCCGATGACGTc A TTCT G CACACACCtGGCTGT^^ 

9-25 (Il/lb) 62 TtGTGTatGAggCAgcgQACaTGATcaTGCAcACcCCcOGgTGcgrrgCCCTGcGTtCgGGA 

1-8 (Z/la) 62 TtCrrGTACGAGgCgGCcGATgCcATcCTgCA caCt CCgGGgTG TOTc C 

40 (4a) 62 TAGTCtATGAAGCTGACCATCACATCCTACACTTGCCGGGGTGOCn^ 
42-43 (4C) 62 TAgTOTATGAOGCCGAACACCftgATCtTACACCTCCCACXSGTO 

44 (4d) 62 TA GTCTA TGAAACCGATTACCACATCrTACA CC^ 

41 (4b) 62 TAUlU'AJlCGAGACGQAGCACCACATCATOCALl'lt*CCAGGQTGT011JCCC^^ 
45-50 (5a) 62 TaGTcTAcGAGGCTCAtaaCCTG ATctTg C AcGCA CCTGGtTG CGT^ 

51 (6a) 62 TCGTGCTGGAGGnKaATGCTATGATCTTGCATTTGCCTGGATGCTTGCCTTGT^ 

1-51 consensus T A TTCA CCCGTGTCCTaG 

SEQ ID NO: Genotype

30-33 (IV/2b) 123 cAATGGCACCcTGCgCTGCTGGATACAAGTgACACCTAATGTGGCTG^^ 

34 (2c) 123 CGCCAACGTCirrCGATGTTGGGTGCCGGTTGCrCCCAATCrCGCCATA;^^ 

26-29 (III/2a) 123 gGGAAAtaCaTCtCGgTGCTGGATACCGGTctCaCCAAAcGTgGCcGTGCaGCaGCCcGGC 

35-39 (V/3a) 123 CGGcAATACATCcAcGTGCTGGACCcCaGTGACaCCTACaGTGGCAGTCAG^ 

9-25 (II /lb) 123 gaaicAActcCTCccgcTGcTGGGTaGCGCTcaCtCCCACgCTcGCgGC'cAGGAAcgccAg^ 

1-B (I/la) 123 GGgTaaCgcctCGAggTGTTGGGTGgCGgTGaCCCCCACgGTgGCCACduSGGAcGGC^^ 

40 (4a) 123 TGGGAACACATCGCGTTGCTGGACGCCGGTGACGCCTACAGTGGCTGTCGCAC^ 
42-43 (4c) 123 tGGGAAtCAGTCACGCTGCTGGGTGGCCCTrACTCCCACCGTGGCGGtGtC^ 

44 (4d) 123 AGGGAACAAGTCTACATGCTGGGTGTCTCTCACCCCC^ 

41 (4b) 123 GGAGAATAClTCTCGCTCCTGGGTGCCCTTGACCCCaiCTGT^ 
45-50 (5a) 123 agaTAATGTOOTAggTGCroGGTca^TCACCCCOlCatTgTC^ 

51 (6a) 123 CGATGATOXrrCaurCTGTTGGCJlTOCT^ 



1-51 consensus 



TG TGG 



T C CC A T C 



WO 96/05315 PCT/US95/10398



FIGURE 1H



SEQ ID NO:
30-33 
34 
26-29 
35-39 
9-25 
1-8
40 
42-43 
44 
41 
45-50 
51 

1-51 

SEQ ID NO:

30-33 
34 
26-29 
35-39 
9-25 
1-8 
40 
42-43 
44 
41 
45-50 
51 
1-51 

SEQ ID NO:
30-33 
34 
26-29 
35-39
9-25
1-8 
40 
42-43 
44 
41 
45-50 
51 

1-51 

SEQ ID NO:
30-33 
34 
26-29 
35-39 
9-25 
1-8 
40 
42-43 
44 
41 
45-50 
51 

1-51



Genotype 
(IV/2b) 
(2c)
(III/2a)
(V/3a)
(II/1b)
(I/1a)
(4a) 
(4c) 
(4d) 
(4b) 
(5a) 
(6a)

consensus

(IV/2b) 
(2c) 
(III/2a) 
(V/3a) 
(II/1b)
(I/1a)
(4a) 
(4c)
(4d) 
(4b) 
(5a) 
(6a) 
consensus 

(IV/2b) 
(2c) 
(III/2a) 
(V/3a) 
(II/1b)
(I/1a)
(4a) 
(4c) 
(4d) 
(4b) 
(5a) 
(6a) 
consensus 

(IV/2b) 
(2c) 
(III/2a) 
(V/3a) 
(II/1b)
(I/1a)
(4a) 
(4c) 
(4d) 
(4b) 
(5a) 
(6a) 
consensus 



184 GCaCTcACTCAcAACCTQCGAaCaCAtgTcGJlcaTGATcGTAATGGCAGCTJlCGC^^ 

184 GCTCTCACXAAGGGCCTGCGAGCAaUZATCGATATCATCG ^ 

184 GCcCTcACGCAGGGCTTGCGOACgaiCATcGJ^TGGTtGTGAIXnt^ 

184 GCAJVCCACGGCtTCGATJUXSCrJUmjlTGTGGACCTatTaGT^^ 

184 gTCcCcACtAcGaCaATACGACgcCAcGTaSAtTTGCTCGTTGGGGCGGCT 

184 (TrCCCc gCAa C QCAgCTt CQACGTc ftCATC GAtCTGCTtGTcGGgAGcGCC ^ 

184 GCTCCGCITGAGTCGTTCCGGCGACMGTGGACrTAATG(n!AG(KGCGGCCACT^ 

184 GCtCCGCTTGAcTCCcTCCGGAGACAlXnX^GACCrtMTGG^ 

184 GCrcC6CTTGA0TCrrrGAGACGT«CGTGGJ^^ 

184 GCACCGTTiUIAGTCCATGCGCJUXKZATGTiU^A^ 

184 GCGGTCACGGCTCCTCrrCGGAGOGcCGTTOJ^J^tTaGCGGaaGtSgOCtG^ 
184 ACGCC<:GaU^CQGQRTTCCGCAGGCATGTGG Aavnx:nX; CGGGC 



T Q 



T GA 



T G 



GC 



T TG T 



245 CGGC < Sn\^V JatTrGG GaGAC c^ 

245 CTGCCCTTTATGTGGGGGACGTOTtnXvGCGCGCIXaATGCTGGCCGC^^ 

245 CcGCtCTtTACGTGGGGQAc cTCTGCGG cG GGgTgATO CT CGCa GCcCA^^ 

245 C'ltKJOL'iXrrAcGTGGGtGATalVlWtjGGGCCUlVl'X*tLU'cGltj^ 

245 CCGctAroTAcGTGGGgGAt CTCTG CCSGaT CtGTttTC Crcs^ 

245 CGGCCCTCTAcGTGGGGQACtTGTGCGGGTCTGTCTTtCTt(?rCgGtCAaCIt?]7cJ^ 

245 CTOCCCrCT AlOTXtS GGGACCirr Q (:GGAGGTGC LTiV^ ^ ^ 

245 CtGCC CTCXA CgTTGGaGA t(TOTOCX3 GTG QtGc AlUViUXi Q^^ 

245 CCXSC LTiri ' A CATTtXSAC U TCTXntmxa^^ 

245 CCGCgcTATACGTCGGgQACQcOTGCGGGGO^gT gTTtt TGGTJ^ 

245 CATCCCTGTACATCGGGGACCTCTGTtX3 C:iVlVlVn ^ ^ ^ 

C TTATGGGATQGG TT CA T 



306 ATOSCCaGAACgCCACaACTTtJlCCCAaGAGTGCAACTGTTCCATCT;^ 

306 GTCGCCACAACACCATAC Xi ' mii ' iV CAGGAATGCAA C ' iU I ' l ' C CATATACCCGGGCCGCA'IT 

306 CTCGCCGCaaCacCACTgGTTTGTGCAaGAaTGCAAtTGCTCcATcriACCCtG^ 

306 CAGACCCCGTCGCCATOUUCgGTCCAGACCTGTAACTGC^ 

306 cTCgCCtCGcCggcAtgaGACag^aCAGgAcTGcAAc^cTCaaTCTATCCcGGcCac^ 

306 cTCtCCCAGgCgCCaCTGGACtJlCGCAaGaCTGcAAtTGTTCtATCT 

306 TCGGCCGCGTCGCCACTGGACCACGCASGAGTXXJU iriVnV » 

306 CCAGC(:GCGACGCCACTGGAC7ACGCJUKSACTGCAATTGTTC 

306 CCJUlCCra^CCGCCftCTGGACatfXCAAGACTGaU ^ 

306 CCGACasaKTCGGOOXXIACCACCCAGGATrGCAACTGCTC 

306 TAGgCCTCGCCaGCAtactacgGTgCAgGACTOCAAcratTCcATrrACAOCGGCCACATC 
306 TCAGCCCCGCCGTCATTGCUlCIXriGCAAGACTGCAA^ 

CC C CA TO AA TO .TC T TA GG T 

367 ACCGGCCACCGCATGGCaTGGCSACATQATGCTaAACTGGTCACCAACTCTtACCATm 
367 ACGGGACACerGCATGGCTTGGGATATGATGATGAACT G GTCGCCCACTACCACCATC 
367 ACtGGaCACCGTATGGCATGGGAcATQATGATGAACTGGTCGCCCACggCCACcaTGATCc 
367 TCAGGACATCGaATGGCTTGGGATRTGATGATGA AI T GG TCCCCCGCtGTGGGTATGG^ 
367 tCAGGTCAcCGcATGGCtTGGGAtATGATGATGAACTGGTCaCCtACAgCaGCcCTaG^ 
367 ACGGGtCAcCGcATOOCaTOGGATATGATQATOAACrGGTCCCCtACgaCgGCgCT^ 
367 ACCGGCCACAGGATGGCGTGGGACATGATGATGAACIXXIAGCCCTACCACCACTC^ 
367 ACgGGCCACAGgATGGCATGGGACATaATQATGAACTGaAGTCCCACAACCACCeTGcTt 
367 ACAGGACACAQAATGGCTTGGQACATGATGATQAATTGGAGCCCCACTGCG^ 
367 TCGGGCCACAGGATGGCCTGGGACATGATGATGAACIXXSASCCCTACCAGC^ 
367 ACcGGCCACCGgATGGCaTGGGACATGATGATGAATTGGTCACCtaCgACaGCCTTGgTGA 
367 ACCGGCCACAQGATGGCTTGGGACATGATGATGAACTGGTCACCCACAACCACTCTGG^ 
C GG CA G ATGGC TGGGA ATGATG T AA TGG CC C T T 



FIGURE 1H



SEQ ID NO: Genotype

30-33 (IV/2b) 428 TcGCCTAtGCcGCtCGTCTtCCTXSAgCTAGtCCTtgAaCTtGTCITCGGcTC 

34 (2c) 428 TGGCGTACTTCKnt^CGCATCCCGGAAGTCATCTrGGATATIG^ 

26-29 (III/2a) 428 TGGCCTrACGcGATGCGCGTTCCCGJUXmjmraTAGAC^^ 

35-39 (V/3a) 428 TgGCGCJlCgTcCTOCattTGCCCCAGACCtTGTrcGACATAaTaGCcOOQGCCCAr^^ 

9-25 (Il/lb) 428 T&TCGOlgtTaCTCCGgaTCCCaaWKrTgTCgTGGAcaTCSCT 

1-8 (Z/la) 428 TaGCtCAGCTGCTCcGQaTCCXgCAaGCCaTCTTCGJUATGATTC 

40 (4a) 428 TCGCCOUSATCATOAGGGTCCCCJlCAGCCTTrm 
42-43 (4c) 428 TCGCCCAGGTcATGAGGATCCCrXIU^CACntrrGCT 

44 (4d) 428 TCGCCCAACXTATOAGGATCCCAGGCGCCATGGTCGACCTG 

41 (4b) 428 TGGCItaiGATCTTACGGATCCCCTCT A TCCrrAG^ 

45-50 (5a) 428 TCGCCCAgtTGCTACGGATtCCCGAgGTGGTCATtGACATCATtGCCGGGGgCCACXT^^ 

51 (6a) 428 TATCTASCA lVntj AGG<nACCTGAGA rn\il\i O^^ 

1-51 coneensus TC GTCC TTGGGCA TGGGG 



SEQ ID NO: Genotype

30-33 (IV/2b) 489 c GltA;itirriXi GCnTGGC CiAlTXt. ' li: CATGCAgGGAGC^ 

34 (2c) 489 TGTAA WlTXXi GCCTCGCTTR LTiVX\: CJlTGCAGGGAT^^ 

26-29 (IIZ/2a) 489 CGTCaTGTrcGGCtTaG CCi ' AL T l\r i ClA TGCAGGGAGCGT^ 

35-39 (V/3a) 489 CATCtTGGCgGGCCTAGCCTATTAcTCcATGaigGGCAW 

9-25 (IZ/lb) 489 agTCCTgGCGGGCCTtaccTACTAtTCCATCQtgOGgAACTGOGCt^^ 

1-8 (I/la) 489 A GTCCT aGCGGG CATA G CCrrATTTg rCCATGQtGGGgA A^ 

40 (4a) 489 CGTCCTCQCGGGCTTXKICGXACTTCAGCATGCAAGGCAATTGGGCCAAGGTAG^ 
42-43 (4C) 469 cg TCCtTgTtG GQtTGGCgTA CTrC aQtATGCAA^ 

44 ( 4 d ) 469 CATTCTGCrrrGGCATAGCGTACTTCAGCATGCAAGCrAATIGTC 

41 (4b) 489 Aijl'iVA'lTjL'l\iGlViASLTA"lVlU^JUjC^TGCASAGTAACT 

45-50 (5a) 489 GGTCTTGTTcGCCGccGCATAcTtcGCGTCgGCgGCtJUlCrTGGGCtJ^^ 

51 (6a) 489 GATACTACTASC CmUti CCTA tTA ' i ' l> GCATGG<rrGGCAAC^^ 

1-51 consensus T T G GC T T TGG AA GT T 

SEQ ID NO: Genotype

30-33 (IV/2b) 550 CTCCTtCTTGTcGCAGGACTGGAtGCA 

34 (2c) 550 CTCCTGCTGACTGCTGGGGTGGAGGCX; 

26-29 (III/2a) 550 CTttTGCTggCcGCTGGgGTGGACGCO .a 

35-39 (V/3a) 550 ATGgTTATGrTTTCAGGgGTCrGAtGCC ' ^ 

9-25 <II/lb) 550 aTGCTACTCTTTGCcGGcGTtGAcGGg 

1-8 (I/la) 550 CTGtTGCTgTTtgCCGGCGTcGAtGCO 

40 (4a) 550 CTmCCTOTTOCTGGGCTBLGACGCC 

44 (4d) 550 Ll VmVlVX TrG CIGQ ASTaS ACGCT 

41 (4b) 550 CTATTCCTCTTTGCCGGGGTCGAGGGA 
45-50 (5a) 550 tTGTTtCTGTTTGCGGGGGTcGATGcC 

51 (6a) 550 CTGTTCCTAlTrGCJ^GGGGTTGAAGCA 

1-51 consensus T T T C GG (ST GA G 
SEQ ID NO: Isolate

56 S14 1 YQVRNSTGLVHVTNDCPNSSrVYBtJUJAIIJIaPGCVPCVREGNtSRCT^^ 

52 DK7 1 YQVRNSTOLYHVra^ 

59 USll 1 YQ\nU7STGLYHVTEa>CPNSSIVYEAADAILKTPGCVPCVRSGNaSllCWV3^^ 

55 DR4 1 HQVIUISTGLYHVTKDCPNSSXVYEAACAIIJITPGCVPCVKEGHtSRCWVAVTPT^ 

54 DRl 1 KQ\TOISTGLYHVTTIDCPNSSIVYEAAX>AILHaPQCVPCVREG10^RC^^ 

53 DK9 1 YCr^SSGLyHVm)CPNSSIVySAADAIZilSPGCVPC^ 

58 SWl 1 YQVRMSSGLYHV^^ 

57 S18 1 YQVRNStGLYHVT^^ 

52 - 59 consensufl ygVRNStGLYHVTNDCPN5SIVYBaJU}aILH-PGC\^CVRSgnafirCWVavtPTNrAT^ 

SEQ ID NO: Isolate

56 S14 62 LPatQIJUlyIDXJ.VGSATLCSALYVGDI£GSVFLVGQICTFSPRRlV^^ 

52 DK7 62 iiTaQlJJtHZDUVGSA^ 

59 USXl 62 LPTTQIARHZDLLVGS^ 

55 DR4 62'ii7TQiiRHioi^ 

54 DRl 62 LPTTQUUUlIDIXVGSATLCSALYVGDIiCGSVFLVGQLFTFSPRRHWT^ 

53 DKd €2'LPATQiimiDXXVGSATLCSALYVOT 

58 SWl 62 liAToiiRHIDiiv^ 

57 Sie 62 LPATOUUWiDliVGSATLCSALYVGDUTGSVFLVSQIJTiSPRRHWTlOT 

52-59 consenBUS LP - tQLRHhIDLLVGSATLCSALYVGDLCGSVFLVgQLFTf S PRrhWTTOdCHCSI YPGHI 

56 S14 123 TGHRMAHDHHMimSPTTALVVAQXJJaPOAIU)UZASAHWGVXJ^ 

52 DK7 123 TGHRMAWDMMMRWSPTIALVVAQIJJaPQAILDMZAQAHWGV^ 

59 USll 123 TGKRMAHDHMMNtfSPTaALVVAQiajaPOAIUHIAaAHWGVXiAGZA^ 

55 DR4 123'TQKRjAWDiJJ^^ 

54 DRl 123 TGHRKsiwDui!]^ 

53 DK9 123 ' TGHrJawJaI^ 

58 SWl 123 TGHRMAWDMUMmSPITALWAQIJiRZPQAVLDMIAaAKWGVIJ^ 

IMIIIIIIIIIIIIIIIII mil IIIIIIIIIIIIIIIIIIIIIIM lllllll I 

57 S18 123 TGHRHAWDMMMKWSPTTALViAQLXAvPOAVU)UIASAHWGVIJU^ 

52-59 consensus TGHRHAWDMHHimSPTtALWAQiaJUPQAiLDMIJU3J^^^ 
SEQ ID NO:




56 


S14 


52 


DK7 


59 


US11


55 


DR4 


54 


DR1


53 


DK9 


58 


SW1


57 


S18 


52-59 


consensus 



FIGURE 2A

184 LLLFAGVDA 

lllllllll 
184 LLLFAGVDA 

lllllllll 
184 LLLFAGVDA 

lllllllll 
184 LLLFAGVDA 

lllllllll 
184 LLLFAGVDA 

Mil till 
184 LLLFCGVDA 

Mil nil . 

184 LLLFSGVDA 

nil III! 
184 LLLFaGVDA 

LLLFaGVDA 



SEQ ID NO: Isolate

75 TIO I yEVRH\reGmYHVTODCSNSSIVfEAaDlIMHTEK3CVPCVREgN8SRCWVALT^^ 

62 DKl X YEVRNVSQvYHVTTOCSKSSIVYSAvDvIMHTPGCVPCVREOT^ 

64 HK4 1 hBVhirUGiYHVTinSCSN^ 

76 US 6 1 YKV1Un^SGmYKVTin)CSNSSrVVEAADMIMHTPGCVPCVR^ 

68 INDB . 1 YEVTUnrSCrnrKVTRDCSHSSIVYEAADKIMHTPGCVPCVR^ 

67 IND5 1 YEVRNVSGVYHVTNDCSNSSZVYEAADMIHHTPGCVPCVREGHSSRC^^ 

73 SW2 1 YBVRNVSGVYHVTNDCSNSSIVYETJU)MIMHnH3CVPCVRBaNSSRCW\^ 

63 KK3 1 YEVRHVSGIYHVTNDCSNSSvVYBTJU3MZUHTPGC\n>CVR£NNS5RC^^ 
66 KKS 1 yCVIUTVSGZYHVTIIDCSNSSXVTBTUMIHHTPGCWani^^ 

61 03 1 yEVRlIVSGVYc}\rit!lDC5NSSZVYBTi^MZMZri^^ 

74 T3 1 yEVRNVSGVYyVTNX>CSNSSIVYETADmUtmK;CVPCVRE8NSSRCWV^ 

65 HK5 1 YEVRNVSGVYKVTtndCSNlSZVYBTtDMZMHTPGCVPCV^ 

71 S45 1 yEVRnVSGaYHVTRDCSNSSZVYEAvOvIlKTPOCVPCVRS^ 

72 SAIO 1 ySVRinrSGmYHVTtlDCSNSSZVyEAADmMKTTGCVPCVRSNNSSRCWVT^ 

69 PIO 1 YEVRin4G\r[^^ 

60 Dl 1 YEVRNVSGVYHVTlIDCSNSSIVyEtiaSMZIflnTGCVPOra 

70 S9 1 YEVRllVSGayHV7lroCSNSSZVyBaADvIUHTPGC^n?CVqBgNSSqCWV^ 
60-76 consensus ySVrNVSGvYhVTtroCSNsSiVyBaaDnafnKTPGCvPCVrEnNsSrCWV^a 
SEQ ID NO:
75 


Isolate 
T10


62 vPTTTIRWIVDIiLVGAAAFCSAMyVGDI^SVFLVSQL^ 


62 


DK1


62 Ipli^iRM^ 


64 


HK4


62 ipTTTIRWyroiivG^ 


76 


US6 


62 vi4TTiHjyW^ 


68


IND8


62 VPTTTZRRHV^ 


67 


IND5 


62 VsTTTIRhirytilv^^ 


73 


SW2 


62 VPTrriiuUWDLLV^^ 
62 vi44TiRRHTO 


63 


HK3 


66 


HK8


62 vi4TTiRRH^ 
62 VPTXTIRRI^ 
62 VracTiraHTOLLVS^^ 


61 


D3 


74 


T3 


65 


HK5 


62 VFTTaIRRHVDU\^^ 


71 


S45 


62 Vs44TZiminmUVGAAAFCSAKrVGDI^S^^ 


72 


SA10


62 VPTTriRSHVDlXVGAAAFCSAMYVGDUrGSVrLVSQLFTFSPRR^ 


69 


P10


62 virrTAIRraVDIiVaAAAFCSAMrTOOLCG^ 
62 VVTIAlJiXVJDU^fGJMFCSMSrvaO 
62 VPlTtlRRKVDUVG^ 


60 


D1


70 


S9 


60-76 


consensus 


vpTttIRrHVDIJ*VGAAaFCSaKma)LCGSVfLvSQLFTfSPRrheTVQdCHCSi» 






SEQ ID NO:

75 



62
64 
76 
68 
67 
73 
63 
66 
61 
74 
65 
71 
72 
69 
60 
70 
60-76 



T10
DK1
HK4
US6
IND8
IND5
SW2
HK3
HK8
D3
T3
HK5
S45
SA10
P10
D1
S9

consensus 



FIGURE 2B



123 ssasQOimbststnitvtsnrj^^ 



123 SGHRMRTOMMMNWSPTTJU-VISQLIAIPOAVVDMWUSAHWCW^ 
123 SGKRhIwDMH^^ 



123 SGHPMAWDMMHHWSPTAALWSQIJjaPOAVMDMVJtfJAHW 
123 SGHRKAm 



123 SGHimAHDMMHNWSPTAALVVSQIJjaPQAVVDMV3USAHHG 

123 SGHRMiwD»»M^ 

123 SOTHftaTOMwl^ 

123 SGHRMiwDKnl^^ 

123 TGHRMAWTOffi^^ 

123 TGHR»tt>^^ 

123 TtnWMAITOMMMrWSPTIALVVSQIJJaPO^ 



123 TCTRMAITOMMMW 

mill 



123 TGHRMAWDMMMNWSPTtJa*WSQLUaPOAIVDMVWJAl^^ 
123 

123 TGHraAWDMM^ 
123 jiiiMAWDftJ©^ 

sGHSMAWDMMMRWS PTaALVvSQLLRi PQAvvDmVaGAHWGvLAGIAYYSMvGNWAICVa^rV 






SEQ ID NO: Isolate



75 


T10


62 


DK1


64 


HK4 


76 


US6 


68 




67 


IND5


73 


SW2 


63 


HK3 


66 


HK8 


61 


D3 


74 


T3 


65 


HK5 


71 


S45 


72 


SA10


69 


P10


60 


D1


70 


S9 


60-76 


consensus







FIGURE 2B



184 mLLFASVDG 

llllllll 
184 ILLFAGVDG 

llllllll 
184 mLLPAGVDG 

llllllll 
1B4 1LLFAE3VDG 

llllllll 
184 KZXFASVDO 

lllllllll 
184 KLLFAGVDG 

lllllllll 
184 MLLPASVDG 

lllllllll 
184 HLLFAGVDG 

lllllllll 
164 HLLFA0VDO 

lllllllll 
184 HLLFAGVDG 

llllllll 
184 ILLFAGVDG 

llllllll 
184 MLLFASVDG 

lllllllll 
184 MLLFASVDG 

lllllllll 
184 KLLFAGVDG 

lllllllll 
184 KLLFAGVXX; 

lllllllll 
184 HLLFAGVDG 

lllllllll 
184 HLLFAGVDG 



mLLFAGVDG 



FIGURE 2C



SEQ ID NO:
77 

78 

79 

80 

77-80 



T2 
T4 
T9 
US10
consensus



1 AQVrKTax^yMVTiroCSNeSITtfQLOAAVIJKVPGCiPCErlGmSRCWIFVtPKVA^ 

Ml II lllllllll IIIIIIIIIIIIMIt III IIIMIIIII lllllllll 

1 AOVKKTtnSyMVTNDCSKDSZTWQLQAAVIilVPGCVPCBhtC^ 

I Mil llllllllllllllllllllllllllllltl II lllllllllllll II 
1 AeVKNTSTSYMVTllDCSNDSZTWQLQJUlVIilVFGCVPCBrVQNaSRCHIP^ 

I I II III I 11 II I III II I II II I I III Mill I II III I nil II II I II II II I 
1 vqVKZrrSTSYKVrnn)CSNDSITWQXjeAAVIi{\^Cn/PCS]cVGNtSRCWIPVS^^ 

aqVJcOTsteYMVTiroCSKdSITWQIxjAAVLHVPGCNP 



SEQ ID NO:
77 



78 
79 
80 
77-80 



Isolate
T2 

T4 

T9 

US10

consensus



62 ALTQGIJmtlDMVn/MSATXCSftLYVGDI^GGVmAAQWFIVSPrrHWFV^ 

lllllllllllllllllllllllllllllllllllllllllll lllll llllllllll 
62 ALTQGLRtHIDM\nnfSATLCSALYVGDLCGGVKIJU^QMFIV^ 

lllllllllllllllllllllllllllllllllllillll lllllllll llllllllll 
62 ALTQGUmilPMNAWSATLCSALYVGPLCGGVMLAAQWFTiSPQHHW^ 

I I III III I II I II Illlllll INI I III Illlllll II II I II III Illlllll 
62 ALTQGUm{IDU\nmSATICSALYVGDeCGGinMIJUlQMPZvSPrKH8F^ 

ALTQGLRTHZDUVVHSATirSJa.YVGDlCGGvHLAAQHPI vSP - hHwFVQeCN^ 



SEQ ID NO:
77 



78 
79 
80
77-80 



Isolate 
T2 

T4 

T9 
US10
consensus



123 TGHRMAWDMUMinfSPTATmLAYAMRVPEVIiDIigGAHUGVMPGIJ^^ 

llllllllllllllllllllllllllltlll II llllllllllllllllllllli M 

123 rmmiAmiiuts^sPTXTHi^ 

llllllllllllllll lllllllllillll II llllllllllillillllllllllll 

123 TGHmAWDHMHNWSPTtimiAYJUiRVPEVXIDIXSGAKWGVHFGXAYFSMO^ 

llllllllllllllll I nil niiiniiniinni iiniiininniiii 

123 TCniRMAWDMHMNWSPT&TlIIAyvMRVPSVIIDIISGAHWGVlFGIJlYFSMQG^ 

TGHRlAWDMHMNtfSPTaTknIIAYaHRVPEVX IDI i BGAHHGVtoFGIAYFSHQGAWAKVWI 



SEQ ID NO: Isolate
77 T2



78 
79 
80 
77-80 



T4 

T9 
US10
consensus



184 LLLAAGVDA 

niiiiin 

184 LLLAAGVDA 

III inn 

184 LLLtAGVX3A 

III inn 

184 LLLaAOVDA 
LLLaAGVDA 



FIGURE 2D



SEQ ID NO: Isolate



82 


DK11


1 VSVmtSSSYYAT2!n>CSKnSITHQLT!UWLHLPGCVPC3^ 

lllll IIIMMIIIII llllllllllllllllllllllllllllllllllllllllll 

1 VEVIU7iSSSyYATmCSN8SITWQLtKAVIJILMCVPCSKDmniiK^ 

Mill 1 III Mill II III iiiiiii lllll nil II lllll nil II nil III II 

1 VEVfWtSfSYYATNDCSHNSITWQLTIAVUILPGCVPCSHDNGTIJim^ 

lllll 1 nniniininiiii iiiiinnniniiniininniiiiiii 

1 VEVmiSBSYYATmsCSimSXTWQLTdAVLHLPGCVPCEXIDNGT^ 


83 


SW3 


84


T8 


81 


DK8 


81-84 


consensus


VSVRN - SsSYYATNDCSMnSITWQLTnAVLHLPGCVFCBNDRGTL - CWIQVTPNVAVKHRG 



SEQ ID NO:
82 



83 
84 

81 
81-84 



Isolate
DK11

SW3

T8 

DK8 

consensus



62 ALTHZnJUUiiDHZVMAATVCSALYVGDvCGAVMIVSQAPZvSPEhHhFT^ 

iiiiniii niiininiiiini innniiiii in i iiiiiniiiii i 

62 ALTHmJUHVDHZVMAAT\rcSALYVGDmCGAVKZVSQAFZZSPBRHNFTQECNC^ 

iiiiiii III iMiiiiiiiniii iiini iiiiniiiiiniiiiiiinii i 

62 ALTKKUmra3VIVMAATVCSAX.7raOVCGAVMZa50AFZZSPSRKNFT^ 

iiniiiiiniiiiniiiiiiiniiiniii in iiiiniiniiiiiiiiini 

62 ALTKmjmtVDVZVMIUTVCSALYVGDVCGAVHZvSOAlZZSPBR^^ 

ALTHOTJl-HvD- IVMAATVCSALYVGDvCGAViavSOAfliSPBrHnFTQECTCSIYQ^ 



SEQ ID NO: Isolate
82 DK11



83
84
81 
81-84 



SW3 
T8 
DK8 
consensus 



123 TGKRMA1«DMMLNWSPTLTmiAYAAR\^SLVI.BVVF^^ 

iiinnniiiiiiiiiiiiii lllll III It ininiiiniiiinniininii 

123 TGHRUAmMHL£nrSPTX«7mLAYAARVPBLVI£VV^^ 

iiniiiiiiiiiiiiiiiiinniniinMiiniiiiiiiiiiiiiiiiiiiini 

123 TGHI^WX3MHZllWSPTLTmiJVYAAKVFBLVZiBVVFG6KWGVVFGIA 

niiiiiiiiniiiiininiiiinn i iiniiiiiiiniiiiiiniiniii 

123 TGHI^WDMMLXfmSPTLTKZIJlYAARVP&lAX^^ 

TG»RMA«DMKLllWSPTLTHIIJVYAARVPBLvLeVVFTO 



SEQ ID NO:
82 

83 

84 

81 

81-84 



DK11
SW3
T8 
DK8 

consensus 



184 LLLVAGVDA 

lllllllll 
184 LLLVAGVDA 

lllllllll 
184 LLI.VA6VDA 

Miiiini 

184 LLLVAGVDA 
LLLVAGVDA 



FIGURE 2E



SEQ ID NO:
86


Isolate 
DK12 


1 


87 


HK10


1 


88 


S2 


1 


90 


S54 


1 


89 


S52 


1 


86-90 


consensus




SEQ ID NO:
86 


Isolate
DK12 


62 


87 


HK10


62 


88


S2 


62 


90 


S54 


62 


89 


S52 


62 


86-90 


consensus




86 


DK12 


123 


87 


HK10


123 


88 


S2 


123 


90 




123 


89 


S52 


123 


86-90 


consensus 





LBWWIVSGLyVLTOTCsNSSIVyEADDVlIJITPGCVPCVQDGNTSTCWTSVW 

llllllllllllllll llllllllllllllllllllllllllllllllllllllllllli 
LBWRHVSGLYVLTODCpNSSrVYKADDVlUfTPGCVPCVQDGm*STC:OT 

mil llllllllll lllllllllllllllllllllllllllllllt lllllllllll 
I*EWRin'SGLYVLTNDCSNSSIVYEW)DVIUrrPGCVPCVQDGNTSTCWTI^^ 

llllllllll lllillllllllllllllllllllllllllllllllltllllllllllll 
LEWWrrSGLYiLTHDCSNSS rVYBADDVILHTPGCVPCVODGirrSTCWTPVrPTTO 

ill III I III lllllllillllllllillllllllllillllll I lllllllllll III 

LBWIUrrSGLYvLTin^CSNSSIVYEADDVZLHTPGCVPCVQDGNTSmCWTPVTPTm 
I£WIWtSGLYvLTNDCBNSSIVY£ADOVZUrrPGCVPCVQDGNTStCWT^VT^^ 



ATTASIRSHVDIAVGAATOCSALYVGDvCGAVFLVGCAFTFRPIUW Q T ^ 

iiiiiiiiiiiiiiriiiiiiiiiiii null Mill II III III I nil II I Mill I 

ATTAS IRSHVDIXVGAATMCSALYVGDMCGAVFLVGQAFTFRPRWt Q T V QTCNCSLYPQm 

iiiiiiiiinnniiiiiiiiiinniiniiiiniinninniiniiiiiii 

ATTASIRSHVDU^VGAATMCSALYVGDMCGAVFLVGQAFTniPRiUIQTV^^ 

iiiiiiiiiiiiiiiiii iiniiiiiiniiiniiniiiiiiiiiiiiiiiiiiiii 

ATrASIRSHNn)IJ«VGAATLCSALYVGCUCGAVFLVGOArrFRPR]^ 

ninnninnniiniinnnninnninnnniiinniiiiiii 

ATTASIRSHVDIXVGRATUrSALYVGDMCGAVFLVGQAFTFRPWWQT V ^^ 
ATTASIRSHVDIXVGAATtnCSALYVGDmCGAVFLVGOAFTFRPRm 



SGHRMAWDMMHNWSPAVGKVVAHVLRLPgTLFDIIAaAHHGIii^ 

ininniiiiniiiiiinniniinnnniiin iiiiniiininini 

SGKRMAWDMMtOTHSPAVGMVVAHVUUiPQlIOTIIAGAHHGIIJUSlA 

iniiniiiiniiniininiinii iniiiniiiiiiiniiiiiniiiiii 

SGKRMAWDMHMNWSPAVGHVVAHVIJaPQTvFDIZAGAHWGZZM 

nininiininniiiiii iiini in iiiiiiiniiiniiiiniiiiii 
II n I n II n I n I n 1 1 1 n I II n in n II II n II I II n I! It 1 1 II II n 1 1 1 

SGHRKAWDUMHNWSPAVGHVVAKIUUJ>QTLFDXXJUSAHWGILAGI^ 



SEQ ID NO:



86 


DK12 


184 


MVKFSGVDA 

Illllllll 
KVMFSGVZ3A 

iiiiiini- 


87 


HK10


184 


88 


S2 


184 


HVMFS6VDA 


90 


S54 


184 


1 iiiini 

MIHFSGVDA 

Illllllll 
HIMFSGVDA 


89 


S52


184 


86-90


consensus




MvMFSGVDA 



FIGURE 2F



SEQ ID NO: Isolate
94 Z7 

93 Z6 

93-94 consensus (26) 



1 VNYhHASGVYHiTNDCPNSSImyEAEHKILKLPGCVPCVReGNQSRCWVALTPTV^ 

III MIIIM lllllllll lllll iillMI III! Illllllllllllll III 

1 VNYWU^GVVHVTiroCPNSSIVyKAEHqlLHLPGClPCNnivGNQSRC^ 
VNyrKASGVYHvTTOCPNSSIvYKAEHqllimPGClPCVRvGNQSRCW^ 



94 Z7 
93 Z6 
93-94 consensus (26)



62 APLESiRSHVDLMVGAATVCSALYIGDLCGG\^VGOHFSFOPRIU^^ 

III I llllllllllllllllll lllll! IMIIIIIMIIMIIIMIIIIIllll 
62 APLdSIJUa{VDI2f\rGJUlTVCSALYvGDLCGGaFLVGQH^ 

APLdSlRimVDUfVGAATVCSALyvGDirGGaFLVGQMFSFQPRR 



SEQ ID NO: Isolate
94 Z7

93 Z6

93-94 consensus (Z6) 



123 TGHRHAWDMMHNWSPTtTLvIJiQVKRZPSTLVDLLTGGHHGiLiGvAYFcMQJ^^ 

lllllllilllllllllll Illllllllllllll lllll I I III lllllllllll 
123 rooRisMiDmaosnfSVTrrLii^^ 



SEQ ID NO: Isolate
94 Z7 

93 Z6 

93-94 consensus (26) 



184 LFLyAGVDA 

III Mill 

184 LFLFAGVDA 
LFLfAGVDA 



FIGURE 2G



SEQ ID NO: Isolate
98 SA5 



100 

97 

96 

99 
101 
96-101 



SA7 
SA4 
SA1
SA6 
SA13 
consensus 



VPYRKASCnmiVTNDCPNSSIVYEADNLIlJiAPGCn^CVkegin^ 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIil lllllllllllllllllll 

VPYTlNASavyHVTMDCPNSSIVYEADNLXIiIAPGCVPCVRQnm;SRC^ 

lllllllllllllllllllllllllllllllllllllllll III lllllllllllltll 
VPYHNASGVYHVTTroCPNSSZVY&ADinaLKAPGCVTCVRQONVSk 

IIIIIIIIIIIIIIIIIIIIMIIII lllllllllllltlllll llllllllllll I 
VPYRNASGVYHVTNDCPHS SIWEADaLILHAPGCVPCVRQDNVSRCWVQITPTLSAP t f G 

llllllllllllllllllllllllll lllilllllllll llllllll llllllll I 
VPYKNASCnrniVTroCPNSSIVyEADDLILHAPGCVTCVRJc^ 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIMIII lllllll lllllllllll 

VPYIUU^6VyH\mn>CPNSSIVYSADDLIUlAPGCVPaniq9N\^RC^ 
VPYRHASGVYHVTNDCPNSSIVYKADnULHAPGCVPCVrqdr^ 



SEQ ID NO: Isolate
98 SA5 



100 

97 

96 

99 
101 
96-101 



SA7 
SA4 
SA1
SA6 
SA13 
consensus 



62 AVTJ^UUlvVDYIAGGAALCSUYVGDACGJlVTLVGQHFtYRPRQHI^ 

llllllll lllllllllllllllllllillllllllll lllllllllllllllllilll 
62 AVTAPIJWlVDYIAGGAALCSJU*YVGDACGAVFLVG<y^ 

IIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII llllllllllillllllllll 

62 avtapxaravdyiju;gaalcsalyvgdacgavflvgoispt^ 

I II 1 1 III Mill llllllll lllllll II II nil I III I lllllllllllllllllll! 

62 AVTAPXJUUVVDYZAGGAALCSJarVGDACGAVFLVGQMFTm 

III III II II III lllllll null I IN III I Hill mil iiiiiiiiiiiiii 

62 AVTAPIJ«AVDYIJUXUUarSALyVCn>vCGAXFI.^^ 

IIIIIIMtlllllllllllMIIIII III lllllllil II I lllilllllllll 

62 AVTAPLRRAVDYLAGGAALCSALYVGDaCGAvFI.VGQHFTy6PRrKnvV0DCI^ 
AVTAPUWaVDYIJtfK^AAIXrSALYVGDaCGAvFLVGQM^ 



SEQ ID NO:

98 



100 

97 

96 

99 
101 
96-101 



SA5 
SA7 
SA4 
SA1
SA6
SA13 
consensus 



123 TaiHFISMa>lSUtWyfS7TrMs^/^^ 

IIIIIIIMIIIIIIIIIIIIII Itlllllllllllilllllll lllllllllllltll 

123 TGHRMAWDKMHHWSPTrALVHAQLIJlIPCV\aDIIAGGHWC^FAAAYFASAAI!^^ 

I I III I I II III I I IJI I I lllilllllllll II III II llllililllllllllll Ik 
123 TGHRMAWDHMHtmSPTrAIJ21AQU4RIPQVVIDIIAGGHWGVLFAAAYFASAAHW^ 

IIIIIIIMIIIIIIIIIIIIII llllllllllllllllllllllllllllllllll II 

TGKRHAWDMMMNWSPTTALIJMA^fLRIPQVVIDIXAGGHWGVX^ 

IIMIIIIIIIIII III lllllllllllllllllllllllllllllllllllllllll 
TGHRHAWDMMMriWSPaTALVMAQKLRIPQV\aDZIAGGHWG\n:«FAAAYFASAANWAKV^ 

I II II I I II I INI I lllllll lllilllllllll II II II INI llllllllllll 
123 TtSKRMAWDMMHNWSPtTALVHAQlIJUPQVVIOZZAGaHWGVLFAAAYyASAAW^ 



123 
123 



TGKRMAWDM^SHI!mSPtTALvMAQlLRIPQWIDIIAG9KWCn^FAaAY£^^ 



FIGURE 2G



SEQ ID NO:
98 








SA5 


184 


LFLFAGVDg 


100 


SA7 


184 


llllllll 
LFLFAGVDA 


97 


SA4 


184 


tllllllll 
LFl^FAGVDA 
llllllll 
LFLFAGVDg 


96


SA1


184 


99 


SA6 


184 


llllllll 
LFLFAGVDA 


101 


SA13 


184 


lllllllil 
LFLFAGVDA 


96-101 


consensus 




LFLFAGVDa 






FIGURE 2H



SEQ ID NO:






81


-64 


(IV/2b) 


1 VEVRNiSBSryATNDCSNnSXTWOLTxiAVIig*PGCVPCE«DKGTl^CWIQVTPN^ 




85 


(2c) 


1 VEVKDTGDSYHPTNDCSNSSIWQLEGAVL»rPGCVPCERTAMV5RCW\n>VM 


77 


-80 


(III/2a)


1 aqVkOTstBYMVTNDCSNdSITWQUsAAVUiVPGCvPCSlcvGNtSRC^ 


86 


-90 


(V/3a) 


1 LEWRNtSGLYvLTNDCflNSSIVYBADDVlIJrrTGCVPCVQDGWTStCWTtJVTPT^ 


60 


-76 


(II/1b)


1 ySVrNVSGVYhVTNDCSKsSiVyBaaDnamKTPGCvPCVrBnNBSxCWVALtPTIJJ^ 


52 


-59 


(I/1a)


1 yOVRKStGLYHNnWCPNSSIVYBaADalLKBPGCVPCVREgnaBrCWVavtPTVATRDGK 




91 


(4a)


1 EHYRNASGIYHiraDCPNSSIVYSWHHILHLPGCWCn^MrONTSRCWTPVTPTV^^ 


93- 


94 


(4c)


1 VKYrNASGVYHvTODCPNSSIvYBABKqILKLPGClPCVRvGNOSRCWNrU.TPT^ 




95 


(4d) 


1 YHYRNSSGVYHVTOTCPNSSIVYETDYHIIJaPGCVPCVRBGHlCSTCWVSLTTT^^ 






(4b) 


1 VKYRHASGVyHVTIIDCPNTSIVyETEHHIllHLPGO^CVRTEKTSRCWVPLTPT^ 


96- 


101 


(5a) 


1 VPYRNASGVYHVTNDCPNSSrVYEADnLILHAPGCNffCVrqcUIVSrCWqlTPTL^^ 




102 


(6a) 


1 L7TGNSSGLYHI*TiroCPKSSIVI£ADittIILHLPGCLPCVR\roDRSTCnTO^^ 


52- 


102 


consensus


Y TNDC N S H PGC PC CW P 


SEQ


ID NO:


Genotype




81 


-84 


(IV/2b)


62 ALTHHLRtHvDmrVMAATVCSJa.yVGDv(:X3^^ 




85 


(2c)


62 i^TICGLRAHIDIIVMSATVCSJa.YVGmn:xSJa2SZJ^^ 


77 


-80 


(III/2a)


62 ALTQGLRTKI0MVVKSATLCSAI.YVGDlCGGvHZJU<2HFIvSPqhHwP^ 


86 


-90 


(V/3a)


62 ATTi^IRSHVDIXVGAATfaCSALYVGDmCGAVFLVGQRFTFRPRRHqTVQ^ 


60 


-76 


(II/1b)


62 vpTttIRrHVDUVGAAaPCSaMyVGDLCGSVfLvSQICTfSPRrhel^rQdCKCSiYP^ 


52 


-59 


(I/1a)


62 LPatQIJUttiIDU.VGSATirSJa.yVGDLCGSVFLVgQLFTfSPRrhfr^ 




91 


(4a)


62 API^SFRRKVDUSVGAATLCSJOiYVGDLCGGAFLICQHXTFRPRRinr^ 


93 


-94 


(4c)


62 JU»LdSlRRHVDIJSV0AATVCSALyvGDLC6Gan.VGQMF3PQPRRI^^ 




95 


(4d) 


62 APIJSLRRHVDLMVGGATLCSALYIGDVCGGVFLVGQLFTFQPRRHKnVOC^ 




92 


(4b) 


62 API£SMRRH\n3U!VGAATMCSAFYIGDUXK;VFLVGQXJl)FRPR}^^ 


96- 


101 


(5a) 


62 AVT»IJUl&VDYLAGGAALCSALYVGDaCaAvFLVGQMPtYrPRqKttVQDCNCSIYSGHI 




102 


(6a) 


62 TPATGFRRK\miJJUIAAVVCSSLYIGDLCGSLFIJlGQLFTFQPRRKWTVQ 


52- 


102 


consensus


RD AC5YGDCG Q P Q OtCS Y G 



SEQ ID NO:








81


-84 


(IV/2b)


123 


TGHRMAWDMMIimSPTLTmLAYAARVPELvLaVVFGGHWGVVFGIAYFSMQGAWAKV^ 




85 


(2c)


123 


TGHRMAWDMHMNWSPTTTOI^YLViaPCVILDIVTGGHWGVMFGLAYFSMQGSWA 


77 


-80 


(III/2a)


123 TGHRMAWDMMMNWSPTaltnZLAYaMRVPEVIiDIiBGAHWGVknFGIAYFSMQGAWAKVv^ 


86 


-90 


(V/3a) 


123 


SGKRMAHDMMMNWSPAVGMVVAHvIJUJ»QTlFDXiAGAHWGXlAGIAYYSMQGNWAl^^ 


60 


-76 


(II/1b)


123 


sGKRHAWDMMHNWSPTaALVvSQLLRiPOAvvDmVaGAHWOvIJlGZJlY^ 


52 


-59 


(I/1a)


123 


TGHRMAWDHMHNWSPTtALVvAQlJJUPOAiLOHIAGAHWGVIJ^ 




91 


(4a) 


123 


TGHRtlAWDKMHNWSPrrruaJ^IHRVPTAFUnfVAGGH^ 


93 


-94 


(4c) 


123 


TGHRM^lWDMMMNWSPTTIXlIiAQVMRIPSTLVDLlAGGHHGvLvGl^^ 




95 


(4d) 


123 


TGHRMAmHMHNWSPTATLVLAQUIRIPGAHNmLXJU^GHHGI 




92 


(4b)


123 


sghrhawdmmhnwsptsalzmaqilripsii/;dixtgghiigvij^laffshqsi^^ 


96- 


101 


(5a) 


123 


TGHRK;VWDHHHNWSPtTALvMAQlLRXPQVVIDIIA69HHGVIJrAaAYfASAANH^^ 




102 


(6a) 


123 


TGHRMAWDMMHNWSPTTTLVZ^SIIJlVPElCASVIFGGHWGILLAVAyFGKAGN^ 


52- 


102 


consensus
SEQ ID NO:


Genotype






81-84


(IV/2b) 


184 


LIXVAGVDA 


85


(2c)


184 


LLLTAGVEH 


77-80 


(III/2a)


184 


LLLaAGVXm 


86-90 


(V/3a) 


184 


MvMPSGVDA 


60-76 


(II/1b)


184 


nUXPAGVZXS 


52-59 


(I/1a)


184 


LLLPaC3VX3A 


91 


(4a) 


184 


LFLFAGVDA 


93-94 


(4c)


184 


LFXfAGVDA 


95 


(4d) 


184 


LFLFAGVDA 


92 


(4b) 


184 


LFLFA6VB0 


96-101 


(5a) 


184 


LFLFAGVDa 


102 


(6a) 


184 


LFLFACTVBA 


52-102 


consensus
FIGURE 4
DK9


59 


US11 


C C 


DR4


68 




72 


dMIU 


69 


• 1 w 


/ u 


CQ 


fl7 




86 




89 




91


Z4 


93 


Z6 


94


Z7


95 


DK13


92 


Z1


96 


SA1 


100 


SA7


102 


HK2 


85 


SS3 


79 


T9 


78 


T4 


80 


US10 


83 


SW3 


82 


DK11


84 


T8 



I/1a



II/1b
4c



4d 



4b 



6a 



2c



FIGURE 6A



SEQ ID NO:

108 

103 

104 

105 

106 

107 



ISOLATE 
DR4 
DK7 

US11

S14
SW1
S18 



103-108 consensus 



I ATGAGCACGAATCCTWUUZCTCAAAGAAAAACCAAACGTAACACC^ 

1 ATGAGCACGWOXZCTJUU^CTCAAAaAAAAACCAAACGT^^ 

1 ATGACC»CQAATCCT3JkACCTCAAAGAAAAACCJUlACC?rAA 

1 ATOAGCACGJUlTCCTAJVACCTaUJUiAAAAACCAAACGTWl 

1 ATGAGCACGAATCCTAAACCTCAAAGAAAAACCAAftCGTAACftCCAACCGTCGC^ 

1 ATGACau:aAATCCTJUUCCTCAAAGAAAAACCAAACGTAACACCAAC 

ATOAGCACgAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCG^ 



SEQ ID NO: 

108 

103 

104 

105 

106 

107 



ISOLATE 
DR4 
DK7 

US11

S14

SW1
S18 



103-108 consensus



62 ACGTCAAGTrcCCGGGTGGCGGTCAGATCGTTOT 

62 ACgrCAAGTTCCCGGGTGGCGGTCAGATCVlTGGTGQA GTn'ALlUtjllG CCGCGa^^ 

62 ACgrgJCTTCCCGGGTGGCGCnrCAGATCXjn'rTCTG^ ^ 

62 ACGTCAAGTTCCCCXyntXXrGGTCASATCGTTGGTGQA GTTTAL"!'!^^ 

62 ACGTCAAGrrCCCQGCn^CGGTCAGAl\Xjri13GTGGA GTITACTTg 

62 ACGTtAACnrCCCGGGTGGCGGTCWaiTCUri^MTGGW^ 

ACGTcAAGTTCCCGGGTGGCGGTCAGATCVlTCKSTGGACnTrACn^^ 



SEQ ID NO:

108 

103 

104 

105 

106 

107 



DR4 

DK7 

US11

S14

SW1
S18 



103-108 consensus



123 CCCTAGATTGGGTCntSCGCGCGaCGAGGAAGACTTCCGAGCGGTCGCAACCTCGA^ 

123 CCCTAGATTGGGTGTGCGCGCGcCGAGGAftGACTTCCGASCTGT^^ 

123 CCCTAGATTGGCmrreCGCGCtaUMJUXlAAtaACTrc 

123 CCCTAGATTtMGTXntSCGCGCGACGAGGAftGWnrcCGAGCOT 

123 CCCrAGATTGGOTGTGCGCGCGACGAQGAAOACTTCCGACCGGra 

123 CCCTAGATItKKnGTGCGCGCGACGAGGJUUyurrTCCGA^^ 

CCCTAGATTOGGTGTGCGCGCGaCGAGCaUtfSACrrcayUSCGGT^ 



108 
103 
104 
105 
106 
107 



DR4 
DK7 
USll 
S14 
SW1
S18 



103-108 consensus 



184 CGTCAGCCTATCCCCAAGGCgCGTCGGCCCGAOGGCAGGACCTGGOCTCAGCCC^ 

184 CCnOVGCCTATCCCOUtfKSCACGTCXKKrCaiAGGGCAGGACC^^ 

/184 COTCAGCCTATCCCOU^KJlCGTCGGCCCGAGGGCAGGACrT^ 

184 CGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGACCTGGGCTC^ 

184 CXntJUXZCTATCCCCJUUWCGCCnCGGCCCGA^ 

184 CGTCAGCCTATCCCCAAGGCGCGTCGGCCCGAGGGCACGACCTOGGCTCACrc 

CGTCAGCCTATCCCCAAGGC • CGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTAcC 



108 
103 
104 
105 
106 
107 



DR4 

DK7 
US11
S14
SW1
S18 



103-108 consensus 



245 cmaSCreCTOATGGCIUlTGAGGGCTCCGGGT^ 

245 CTTlK^CCCCTCTATGGCAATtSAGGGCTCSCGGGTGGGCGGGATGGCTCCTro 
245 StOgScCTCTATG^ 

245 criGGCCCCTCTATGGCAATGASGGCTGCGGGTGGGCOG»^ 
245 CrTGGCCCCTCTATGGCAATGAGGGCTGCGGaTCTC 

245 CTTCGCCCCTCTATGGCAATOAGGGCTOCGGgTGGGCQGGATGGCTCCTGTCCCCC 
CTIGGCCCCTCTATCGCAATGAtKX^^ 



SEQ ID NO:

108 
103 
104 
105 
106 
107 

103-108 



ISOLATE
DR4
DK7
US11
S14
SW1
S18 

consensus 



306 CTCTCGGCaCAGCTCKX»CCCCACAGACCCCCGG^ 

306 CTCTCGGCCTAGCTGGGGCCCCACACaACCCCCCSGCGcAK^^ 

306 cTcrcGGcca^crGGGGCccaau^ 

306 CTCraS<XCrAGCIGGCGCCCrA^ 

306 CTCcCGGCCTAGCTGCKKKrCCTACAGACCCCCGGCaTAGG^ 

CTCtCGGCCTAGCTGGGGCCrcACaGACCCCCCXSCGtAGGTCGCGCA^ 
108
103 
104 
105 
106 
107 



DR4 

DK7 
US11
S14
SW1
S18 



103-108 consensus



367 ATCGJ^CCCTcACGTGCGGCrrCGCCGACCTCATCXXOTAC^^ 

367 ATCGATACCCTTACGTGCGGCTTCGCOyurCrCATGGGCTA^ 

367 ATCGATACCCTTACGTGCGGCTTCGCCGJ^CCrCATGGGGTACATJ^GCTCCT 

367 ATCGATACCCTOlCGTtkrGOClTCGCCGACCTCATGGOOT^ 

367 ATCGATACCCTCACGTGCGGCTrcGCCGACCTCATGGGCTACATTCCGCT^ 

367 ATCGATACCCTCACCntSCGGCTTCGCCGACCTCATGGGGTACATrcCGCT^ 

ATCGAtACCCTcACGTGCGGCrrCGCCGACCTOVTGGGGTACA^ 



108 
103 
104 
105 
106 
107 



DR4 

DK7 
US11
S14
SW1
S18 



103-108 consensus



428 CcCTTGGgGGCGCTGCaiGGGCCCTGGCGCATGGamX^ 

428 CTCTTGGAGGCOCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTT^^ 

428 crCrCGGJdSGCGCIGCCJJSGGCCCrGGOXA^ 

428 CcCTCGGgGGCGCTGCCAOGGCCCTGGCGCATGGCGTCCGGGT^^ 

428 CTCTtGGAGGCGCTGCCAGGGCCCTGGCOCATGGOTTCCGGGTTC^^ 

428 CTCTcGGAGGCGCroCCMGGCCCTGGCGaiTGK 

CtCT -GGaGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGgGTTCTGGA^ 



SEQ ID NO: 

108 

103 

104 

105 

106 

107 



DR4 
DK7 
US11
S14
SW1
S18 



103-108 consensus 



SEQ ID NO:

108 
103 
104 
105 
106 
107 



DR4 

DK7 
US11
S14
SW1
S18 



489 CrATGCAACAGGGAAt LV lXJLU Xj Gn GCnV A Vli r i CT AlX^nCl^^ 

489 CTATGCAACAGGGAACCTTCCTGGTIG ClVmVX'CTATLn'it:L^^^^ 

489 CTATGOUCAGGGAAOTTCCTGCTTOCiriTTCTCTATO 

489 CTATGCAACAGGGAACCri'LLiUiilLiLlLi'nLi'LiAXLXlCCTcCTaGCCt,^^ 

489 CTATGOJlC^GGGAACCTTCCIXKnTOCr^^ 

4 89 CTATGCAACAGGGAACCTTCCTGGTTCCTCTTrCrCTA^ 

CTATGCAACAGGGAACCTTCCTGGTTGCTCTITCTCTATC^ 



550 TGCtTGftCCGTGCCCGCaTCGGCC 
550 TGCCTGACCGTGCCCGCTTCGGCC 
550 TGCCTGACTGTGCCCGCrrCAGCC 
550 TGCCTGACTGTGCCCGCrrCAGCC 
550 TGCCTGACaGTGCCCGCGTCAGCC 
550 TGtCTGACtGTGCCCGCGTCAGCt 



103-108 consensus



TGccTGACtGTGCCCGCtTCaGCC 



SEQ ID NO:

119 

117 

118 

111 

112 

113 

114 

115 

116 

122 

109 

110 

123 

124 

120 

121 



S9
IND3
IND8
D1
US6
P10
DK1
T10
SW2
HK4
SA10
S45
P8
T3
HK3
HK5



109-124 consensus



1 ATGAGCACGAATCCTAAACCTCJUUU5AAAAACCJUACGTAACACCAACCG 

1 ATCAGCAOyUVrCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCAC^ 

1 ATGAGCACGAATCCTAAACCTCAAAGAAftAACCAAACGTAACR C CAACCGC^^ 

1 ATGJUSCACGWlTCCTJUACCTCAAAaAAAAACCAAACtrrAAa^ 

1 ATGACCACGAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACA^ 

1 ATCAGCACGAATCCTJUU^CXXaUU«yUUUU^CAAACGTAAC^ 

1 ATGAGCWCGAATCCTiUUU:CTaUU;GAAAAACCAAACGTAACACCAA^ 

1 ATGAGCRCGAATCCTiUU^CT C AAAGAAAAACCAAACGTAACACCftACro 

1 ATGAGCRCGAATCCTAAACCTCAAAGAAiUU^aUUCCTAACACCAACCGCCGCCCA^^ 

1 ATCJUraiCGAATCCTAAACCTCAAWaUU^gAC^^ 

1 ATGAGCACGAATCCTAAACCTaUUUjAAAAACCAAACGTAAC^ 

1 ATGAGCACGAATCCTJUU^CKyUUUJAcAAACCWUU^^ 

1 ATGAGCACGACTCCTJUUlCCTCAAAGftAAAACCAAArGTAACACCAgCCGCCGCC 

1 ATQAGCACGAATCCTAAACCraUUtfSAAJUU^CAAArGTAAC^ 

1 ATGAGCJtfXSAATCCTAAACCTCAAAGAAAAACCWVACGTAACACCA^ 

1 ATtaiSCACGAATCCTAAACCTCAAAGAAAiJlCCAAArGTAACACCAACC^^ 

ATGAGCACCJAATCCTAAACCTCAAAGAaAaACCAAACGTAACACCAaCCCKrCGCCCA 



119 
117 
118 
111 
112 
113 
114 
115 
116 
122 
109 
110 
123 
124 
120 
121 



S9
IND3
IND8
D1
US6
P10
DK1
T10
SW2
HK4
SA10
S45
P8
T3
HK3
HK5



109-124 consensus










119 S9 123
117 IND3 123
118 IND8 123
111 D1 123
112 US6 123
113 P10 123
114 DK1 123
115 T10 123
116 SW2 123
122 HK4 123
109 SA10 123
110 S45 123
123 P8 123
124 T3 123
120 HK3 123
121 HK5 123
109-124 consensus




SEQ ID NO: ISOLATE

119 S9 184



62 ACGTtJJlCrrrCCCGGGCGGTGGtCAGATCGTcGGTGGAGm 
62 ACCnX:RACnTCCCGGGaxat;GCCAGAin,XylTGGTGG ft^^ 

62 ACGTCAAGTrCCCGGGCGGTGGCCAGAinjjTTGGTGGA GTITA C CTGTTG CCGCGCA^ 

62 ACGTCAAGTTCCCGGGCGGTGGTatfaVl^OTTGGT GG A CmTA ^ 

62 ACGTCAACTTCCOGGGCGGTGGTCAGAlXJUarCGGTGGA GTITA C CTGTTG CCGTO 

62 ACGTCAAGTrCCCGGGCGGTGGTCJUGATCtflTGGTGGA GTITA C CTGTI^ 

62 ACGTCAAGTTCCCGGGCGGTGGTCASATC G TT GG T G GA GTTIA C CTGTTG CCGCGC^^ 

62 ACGTCAAGrrCCCGGGCGGTGGTCAGATCC>nWit3GA GTTTA C CTG^^ 

62 ACGTCAAGTTCCCGGGCGGTGGCCAGATa»'nXKSTGGA GTrrA C CTtnTG C^ 

62 AOGTtJUVGTTCCCX«K!tCGGTGGCCAGATCGTcGGTGGACTTIACCT 

62 ACGTCAAGTreCCGGGCGGTGGTCAGATCtflTGGTGGA GTcT AtCTtnTC 

62 ACGTCAAGTTCCCGGGtGQcOGTCAG AlXXiriTO T G GJ UnTrA C CTOTTG CCGCGC^^ 

62 ACGTTJUtfrrrCCCGGGCGGrrGGTCAGATCGTltXmKaA g 

62 ACGTTJUtfTTTCCCGGGCGGTGGTCAGATCglTGGTGGJ UnTrA C CT^ 

62 ACGTCAAGTTCCCGGQCGGTGGTCAGATCGTI G GTGGA CTTrA C CTGTTC 

62 ACGTCAACnTCCCGGGCGGTGGTCAGATCGTTGGTGGAGTrrACCTtrrTGCOT 

ACGTcAAGTrCCCGGGcGOtGGtCAGATCGTtGGTGGAGTtTAcCTGTTGCCGCGCAGGGG 



GACITCCGACCOGTCGCAACCTCGTGGAAGG 



CCCCAGGTrGGGTtrrGCGCGCGACtAGGAAGACrniCGAGCGCTCC^^ 
CCCaUSGTTGGGTGTGCGCGCGACrAGGAAGACrrCCGWGCGGTCGC^ 



CCCCaGGTTGGGTtrrGCGCGCgACCJUKSAAGACrTCcGACCGgTCgCA^ 



CGACAACCTATCCCCAAGGCTCGCCatCCCGAGGGcAGGGCCTGGGCTCAGCCCGGGTACC 



WO 96/05315 PCT/US95/10398

51/89

FIGURE 6B



117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
109-124 consensus


SEQ ID NO: ISOLATE
119 S9
117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
109-124 consensus



184 CGAOU^CTATCCCCJUGGCTCGCCGGCCCGAGGGTJUSGGCCTCGGCT^ 

184 CGJUaUU:CTATCCCaUU3GCrCGCCGGCCCGAG0Cn»GGGCC^^ 

184 CGACAACCTATCCCCAAGGCTCGCCOaCCCGAGGGTAGGGCCTGGGCTC^ 

184 CGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGCACKSGCCTOGGCT^ 

184 CGAOIACCTATCCCCAAGGCICGCCGGCCCGAGGGCAGGGCCTGGGCTCA^ 

184 CGAOJUrCTATCCCCaUUXSCTCGCCGGCCCQAGGGaVGGaCCT^ 

184 CGACAgCCTATCCCCAAGGCTCGCCAGCCCaW«XSC3U3GGCC^^ 

184 CGACAACCTATCCCCAAGCJCTCGCCJUjCCCGAGGGCAGGGCCTGGGCTCAG^^ 

184 CGAOUVCCTATCCCCAAGGCTCGCaUCCCGAGGGCAGGACCIGGC^^ 

184 CGAOU^CTATCCCCAAGGCrCGCCAGCCCGAGGGCAGGACCTGGGCCCAGCCCG^^ 

184 CGACAACCTATCCCCAAGGCrCGCCGGCCCGAGGGCJUXX^CCTtXX^ 

184 COACAACCTATCCCCAAGGCTCGCCGGCCCGAGGCrrAGCX^CCTGGGCTC^ 

184 CGAOU^CTATCCCCAAGGCTCGCCGGCCCGJUXiGTAGGGCCITKK^CTCAQC^ 

184 CGACAACCTATCCCCAAGGCTCGCCaACCCGAGGGCAGaACCTGGGCTC^ 

184 CGACAACCTATCCCCAAGGCTCQCCgACCCGAGGGCAGGACCIGGGCTCAGCCCGC^^ 

CGACAaCCTATCCCCAAGGCTCGCCggCCCGAGGGcAGGgCCTGGGCtCAGCCcGGGtAcC 



245 CrnKSCCCCTCTAcGGOUVTCAGGGCrrGGGCnxX^ 

245 CTTOGCCCCTCTATGGCAATGAGGGCTTGGGGTGGGCAGGATGG 

245 CTTCreCCCCTCTATGGCAATGAGGQCTTGGGGTGGGCAGGATGGC^ 

245 CTTGGCCCCTCTATGQCAACGJU^GGCTTGGGGTGGGCAGGAIXKX^ 

245 CTTGGCCCCTCTATCGCaUlCGAEyx;CaTGGGGTQGGCACX» 

245 CTTGGCCCCTCTATGGCAATGaOGGCtTCKKKnGGGCAGGATtX;^^ 

245 CmGGCCCCTCTATGGCAATCAGGGCATGGGGTGGGCAG^ 

245 CrTGGCCCCTCTATtKKJUmSAGGGCATQGGGTGGGCAGGATGGC^ 

245 CCTGGCCCCTCIATGGCAATGAGGGCATGGGaTGGGCACKSATGGCTC 

245 CrrOGCCCCTCTATGGCAATOAGGGOlTGGGCntXKKa^^ 

245 CTTGGCCCCTCTATGGCAATGAGGGCTTCKKXrroGGCAGGATGGC^ 

245 CTTGGCCCCTCTATTCCAATCACXXXOTGGGGTCXK3CAGGATOT 

245 CTTGGCCCCTCTATGcCAATGAGQGCTTGGGXrrGGGCgGGATGGCTCC^^ 

245 CrrGGCCCCTCTATGGCgACGAGGGCATGGGGTGGGCaGGATGCCrCC^^ 

245 COTGGCCCCTCTATGGCAACGAGGGC^TGGGGTGGGaUXyVT^^ 

245 CTIXXjCCCCTCTATGGCAAtGAGWGCATGGCGTGGGCACXyaXX; 


C tTGGCCCCTCTAtGgCaAt GAGGGC - TGGOgTGGGCaGGATGGCTCCTGTCaCCGCgcGG 



SEQ ID NO:

119 

117 

118 

111 

112 

113 

114 

115 

116 

122 

109 

110 

123 

124 

120 

121 



S9
IND3
IND8
D1
US6
P10
DK1
T10
SW2
HK4
SA10
S45
P8
T3
HK3
HK5



109-124 consensus 



306 cTCTCGGCCTAGTTGGGGCCCCAatGACCCCCGOCGTAOGTCGCGTA ATITGG GTAAgGTC 

306 tTCTayKrCT A GlT GG GGCCCCACAGACCCCCGGCGTAGGTCG CGT A ATlTC GGTAAaG^ 

306 CTCraSGCCrAGTTCKSGGCCCCACAGaCCCCCGGCGTAGGTCGCC^ 

306 CTCCCGGCCTAGTroGGGCCCCACcGACCCCCGGCOTAGGTCGCGTAATTOGGW 

306 CTCCCGGCCTAGTTGGGGCCCCACGGJ^CCCGGCGTAGGTCGCCTA ATITG GCTAAG^^ 

306 CTCTCGGCCTAgTTGGGGCCCCACGGACCCCCGGCGTAGGTCGCGTR ATTTG GGT^^ 

306 CTCcCGTCCTACnrrGGGGCCCC^^ 

306 CTCTCGQCCTAGTIXXX^GCCCCACtGACCCCCSGOCn'AQQTCGCGTA A 
306 CTCTCGGCCTAGTTGGGGCCCaUXK»CCCCCGGCCTAGGTCGCGcAA^^ 
306 C rxrrC GG CCrABritXXSGCCCCACGGACCCCCO^^ 

306 CTCCC GG CC'l TU nT G GGGCCCCACGGACCCCOOGCGTAGGTCGCGCA ATTTG GGTAA^^ 
306 CTCCCGGCCTACnTGGGGCCCCACGGACCCCCGGCGTAGGTCGCGCAATITGC^ 
306 CTCCCQGCCTAATIXKXy^CCCCACaGACCCCCGGCGTAGGTCGCGtA ATeTG GGT^ 
306 CTCTCGGCCTAATTGQGGCCCCACGGACCCCCGGOnAGGTCGCGcA ATTrGG GT 
306 CTCTCGGCCrAgTITCGGCCCC3lCGGACCCCCGGCCrrAGGTCX3CGtAAT^^ 

cTCtCGGCCTAgTTGGGGCCCCAcgGACCCCCGGCGTAGGTCGCGtAATtTGGGTAAgGTC 



SEQ ID NO:
119
117
118



S9
IND3
IND8



367 ATCGATACCCTCACATGCGGCTrtGCCGACCTCATGGGGTACATtCCGCTCCrrCGGCGCCC 
367 ATCGATACCCTCAGATGCGGCTTCGCCGACCTCATGCGGTACATCCCGCTCGTCGGCGCCC 
367 ATCGATACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATCCCGCTCGTCGGCGCCC 






111 D1 367
112 US6 367
113 P10 367
114 DK1 367
115 T10 367
116 SW2 367
122 HK4 367
109 SA10 367
110 S45 367
123 P8 367
124 T3 367
120 HK3 367
121 HK5 367
109-124 consensus





ATCGATACCCTOlCaTGCXXKnTcGCCCyiCCTCATGGGGTACATtCCGC^^ 



119 
117 
118 
111 
112 
113 
114 
115 
116 
122 
109 
110 
123 
124 
120 
121 

109-124 



S9
IND3
IND8
D1
US6
P10
DK1
T10
SW2
HK4
SA10
S45
P8
T3
HK3
HK5



consensus 



SEQ ID NO:

119 

117 

118 

111 

112 

113 

114 

115

116 
122 
109 
110 
123 
124 
120 
121 



S9
IND3
IND8
D1
US6
P10
DK1
T10
SW2
HK4
SA10
S45
P8
T3
HK3
HK5



109-124 consensus 



428 CCCrAGGGGGCGCTGCCAGGGCtCTOGCGCATOGCGTCCGGGTtCTXySAG^ 

428 CCCTAGGGGGCGCTt3CCAGGGCCCTGGCGCATGGCGTCCGGGTCCKX»GGAra 

428 CCCTAGGGGGTGCTGCCAGGCXrCCTGGCGCATGGCGTCCGGGTOT 

428 CCCTAGGGGGTGCTGCCAGGGCCCTGGCGCATGGCQTCCGGGTTCTGGAGGACGGC^ 

428 CCCTAGGGGGCGCTGCOUKJGCCtTGGCGCATGGCGTCCGGGTO 

428 CCCTAGCXXWCGCTGCCASGGCCCTGGCCCATOGCGTCCOGGm 

428 CCCTAGGGGGCGCTGCCAtK»CCCTG0CGCATGaCGTCCC»3GTO 

428 CCCTASGGGGCGCTGCCAGGGCtCTGGCaCATGGtGTCCGGGTTCT^^ 

428 CCCTAOGGGGCGCTGCCftGGGCCCTGGCgCATOQcGTCCGGGTcCTGGA^ 

428 CCTIAGGGGGCGtTGCCJUSaGCCCTGGCaCATGGtGTCCGGGTTgTGGAC^^ 

428 CtTrAGGGGGCGCTGCaWSgGCCTTGGCGCATGGCGTCCGGGTTCrGG^ 

428 CCCTAGGGGGCGCTGCCaVBaGCCTTGOroC^^ 

428 CCtTAGGGGGCGTTGCaUXXSCCCTGGCGCATGGCGTC 

428 CCCTAGGGGGCGTTGCCAGAGCCtTOGCRCATGGTGTCCGG GTTCTG GASGACGG CCT 
428 CCCTAGGGGGCGTTGCCAGAGCCc1XX3CAaicGCrrCTCC^ 

CcCTAGGGGGcGcTGCCJ^gGCCCTGGCgCAtGGcGTCCGGGTtCTGGAgGACGGC^^ 

4 89 CTATGCRACAGGGAACcTcCCt^jGTltX."ltrri'lViVl^^^ 
489 CTATGCAACAGGGAJUrrTGCCCXK»TiU.Tt'mLiLlATa^^^ 

489 CTATGCAACASGGAACTTGCCCGU'ri'U-'lt i 1 1 CTCTATCTTCCTTTTGG l,! x x\»CTATC C 

489 tlATCCAACAGGGAAtTTGCCCQ O ' nUUXLl 1 i CTCTATCTTC CTCTTG G ClTrGCTG TCC 

489 CTAltKJU^CAGGGAACTTGC CCUU ' rA ' U .'XLi 1 ILl CrATCnVClVnXjGCTTrG CrGTC C 

489 CTATGCAACAGGGAATcTGCCCGGyAtiCTLTllLiLiAiVi lL LiLUl» GCTl^^^ 

4 89 CTAcGCAACAGGGAATrTQCC ^^tf iTGCl VmVlLlA T^^ 

489 CIATQCAACAGGGAATTTGCCLUiliUVL'ACi'i'riTCT ATC^^ 

489 CTATGCAACAGGGAATCTGCCLX i Gl'iqcr CCTTITL TA lVri^- Cr ^^ 

4 89 CTATSCaACAGGGAATTTGCCOGGriUC'lLrntiClAlVnCC^^ 

4 89 CTATGCAilCAGGGAATITGCCCGGTrGCc Cmr X'CT 

489 CTAJGCAACAGGGftATCTGCCCGOVltiLUliri^l^ 

489 CTJmaCAACftGmAftTCTGC CIXJGlTCCTLTr ^ 

489 tTAcGCAACAGGGAATTTng tritJUl ' raCitJri TCTC T^ llGG CTTTGCTgTCC 
489 CTAtGCAACAGGGAArrTACCCG U ' nWi ' L ' i ' i ILXC T A TCTrCCTOTGQCTTOCTGTCC 
4 89 CTAcGCAACAGGGRATaTACCCG Ur i tiCU t,'XU'XC X CT A TCrTCC^ 

cTAtGCAWyUSGGAAttTgCCcGGTTGCtCtTTcTCTATCrrCCT^ 



SEQ ID NO:

119 

117 

118 

111 

112 



ISOLATE
S9
IND3
IND8
D1
US6



550 TGTTTGACCATCCCAGCTTCCGCr 
550 TOnTGACCATCCCAGCXTCCGCT 
550 TGTTrGACCgTCCCAGCTTCCGCr 
550 TGTTTGACCATCCCAGCTTCCGCT 
550 TGTTTGACCATtCCAGCTrCCGCT 






113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
109-124 consensus



550 TGcCTGACCATCCCAGCgTCCGCT 
550 TGTtTGACCATCCCAGCTTCCGCc 
550 TCSTCTGACCATCCCAGCTTCCGCT 
550 TGTCTGACCATCCCAGCTrCCGCT 
550 TtnTrOACCATCCCAGCTTCCGCT 
550 TGTTTaACCATCCCAGCCTCCGCr 
550 TGCTTGACCATCCCAGCTTCCGCT 
550 TGtcTGACCATCCCAaCTTCCGCT 
550 TGCTTGACCATCCCAGCTTCCGCT 
550 TGCtTGACCACCCCACCTTCCGCT 
550 TGtcOTACCACCCCAGtTTCCGCT 

TGttTgACCatcCCAGctTCCGCt 



FIGURE 6C



SEQ ID NO: ISOLATE
119 S9 1
117 IND3 1
118 IND8 1
111 D1 1
112 US6 1
113 P10 1
114 DK1 1
115 T10 1
116 SW2 1
122 HK4 1
109 SA10 1
110 S45 1
123 P8 1
124 T3 1
120 HK3 1
121 HK5 1
108 DR4 1
104 US11 1
105 S14 1
106 SW1 1
107 S18 1
103 DK7 1
103-124 consensus





ATGAGCACGRATCGTJUUlCCTCAAAGAAAAACaUftCGTAACACa^ 
ATGAGCACGAATCCTAAACCrCAAAGAAAAACCAAACGrrAAOWrCW 
ATGAGCACGAATCCT3UUVCCTCAAAGJUUUA C CA3VACGT^ 
ATGAGOVCGAATCCTAAACCXXaUlAGAAAAACCAAACCT 

ATGAGCACGAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCftACCGCCGCCCAC^ 

ATCAGCACGAATCCTAAACCTCAAAGAAAAACCAAACGTAAOUXAACCGCCGCCC^^ 

ATGAGCACGAATCCTAAACCTCAAftGAAAAACCAAACGTAACACCAACCGCCGC 

ATGAGCACGAATCCTAAACCTCAAAGAAAAA C CAAACGTAACACCAACCGCCGCCCACATC 

ATGAGGACGAATCCTAAACCTCAAAGAAAAACCAAACCrrAACACCAACTC^ 

ATGAGCACGAATCCTAAACCICAAAGAAAgACCAAACGTAACACaUCCGCCGCC 

ATGAGCRn^TrrrftJAf'* ^^aBnailfta ftCCAAACGTAACJ^CAA^ 

ATGAGCAOSAATCCrrAAACCTCAAAGAcAAACCAAACGTAACACCAACC^ 

ATGAGCACQAtfTCCIAAACCTCAAAGAAAAACCAAACGTAACACCAgCCGCCGCCCACft^ 

AroAgCACGAATCrTA^gCTg^^A&GJJUL^ ^ 

ATGAGCACGAATCCTMtf^CTCAAAGAAAAACCAAACQTAACACCAA C CGCCG^ 

ATGJtfSCACGAATCCTAAACCTCAAAQAAAAACCJU^ACGTAACACCAA^ 

ATGASCACGAATCCTAAACCTCAAAGAAAAACCAAACOTAAa^CAAC CCgro ^ 

ATGAGCACGAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAAC CGTO C^ 

ATCAGCACGAATCCTAAACCTCAAAGAAAAACaUU^GTAAC^ 

ATCAGCACGAATCCTAAACCXrAAABAAAAACCAAACGTAACACCAACCGTC^ 

ATOAGCACoAATCCTAAACCXCAAAGAAAAACCAAACGTAACACCAIlCCCTCGCCCAC^ 

ATGAGCACgAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACA^ 

ATGJUGCACgAaTCCTAAACCTCAAAGAaAaACCAAACGTAACACCAaCCGcC^ 



SEQ ID NO: ISOLATE
119 S9
117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18
103 DK7
103-124 consensus



62 ACGTtAAGTTCCCGGGCGGTGGtCAGATCGTcGGTGGJ^GlTTACCTC 

62 ACGTCAAGTTCCCGGGCGGTGGCCftGATCb"rrt3GTGGA GTITA C CTGTTG CCGCGCAf^^ 

62 ACGTCAAGTTCCCGGGCGGTGGCCAGATLXrritXriTXU tfnTIA C CTGTro 

62 AC<nOUtfriTCCCGGGCGGTGGTCAGAlUfrXttGTt«»ft g 

62 ACGTCAAGTTCCCGGGCGGTGGTCAGATLVriTXSTGGA GTTT AC C^ 

62 ACGTCAAGTTCCCGGGCGGTGGTCAGATCVntXjTTCA GTTrA C CTG^ 

62 ACGTCAAGrrCCCGGGCGGTGGTCftGATCCiVAXjGTGGA GTITA C CTGTItS CC^ 

62 ACGTaUUnrCCCGGGCGGTGGTCAGATLXrntjGTGGA glTTA C C^ 

62 ACGTaUtfnTCCCGGGCGGTGGCCAGATWntSGTGGA GTrTA C CTGT^ 

62 ACGTtAAGTTCCCGGGCGGTGGCCitfSATCGTcGGTGGAGTrTAC CTGr 

62 ACGTCAAGTTCCCGQGCGGTGGTCAGAltXirA'GGTGQA GTgT Ac CTGT^ 

62 ACGTCAAGTTCCCGGGtGGcGGTCACiATCGlTGGTCGA STTTA C CTC^^ 

62 ACGTTAAGTrCCCGGGCQGTGGTCASAlt:b"X'iX;UltjGA CTTTAanxyr^^ 

62 ACGTTAAGTTCCCGGGCGGTGGTCAG Al tX S TTGGTGGA CmTA C CTGTrG CCGCGC^^ 

62 ACGTCAAGTTCCCGGGCGGTGQTCAGAl^XjTiXiGltSaJl GTTTA C CTG 

62 ACGTCAAGTTCCCGGGCGGTGGTCAGAltXrritKSTGGA CnrrTACC^^ 

62 AamJtftfm^CCGGGTGGCGGTCAG A lXXyiT GG T G GA GTTTACTTGT^ 

62 ACGTCAAGTrCCCGGGTGGCGgTCftGATC & TT GG T G GA GTITACTItnTG CC^^ 

62 ACGTCAAGTrCCCGGGTGGCGGTOtfamXfniSGlXyS agr^^ 

62 ACGTCAAGTTCCCGGGTGGCGGTCAGAlt^lTGGTGGA GTTIACTTCTrG CCG 

62 ACCrrtAAGTTCCCTXyrrGGCGGTCAGAT LVn t a GT G GA g 

62 ACGTcAAgTreCCGGGTGGCGGTCAG AatXa Tr GG TGGAGTTrALTi'b nOCCGCGCAGGGG 
ACGTcAAGrrCCCGGGcGGtGGtCAGATCGTtGGTGGAGTtTAcCTGTTGCCGCGCAG 



SEQ ID NO:
119 S9 123
117 IND3 123
118 IND8 123
111 D1 123
112 US6 123
113 P10 123
114 DK1 123
115 T10 123
116 SW2 123
122 HK4 123



CGCGCGACTAGGAAQACTTCCGAGCGGTCGCAACCTCGTGGAAGG 






109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18
103 DK7
103-124 consensus


SEQ ID NO: ISOLATE
119 S9
117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10




123 CCCCAGGTrGGGTGTGCGCGCGACgAGGAAGACTTCCGAGCCOTCGa^ 

123 CCCCftGGTroGGTGTGCGCGCGACTJUyaAAGACrrCCGAGCGGTCaCAACCT 

123 CCCCAGGTTGGGTCrroCGCGCGACTAGGAAGACTTCCGUlQCGATCGCJUlCCT 

123 CCCCASGTTGGGTCnt^CGCGCGACTAK^AGACTTCCGJlGCGGTC 

123 CCCCAGGTTGGGTGTGCGCGCGACCAGOAAGACTTCaGAGCGGTCGCJUUrCTCGT^ 

123 CCCCAGCTTTGGGTTmSCGCGCGACCAGGAAGACTTCCGAGCGGTCGCAAC^^ 

123 CCCTAGATrGGGTCjIGCGCGCGACGAOGAAGACTTCCGJUjCGGTCGCiUlCCTC 

123 CCCTAGATTGGGTGTGCGCGCGACGAGGAiUIACTTCCGAGCGGTCGCAACC^ 

123 CCCTAGATTGGGTGTGCGCGCGACSAGGAAGACTTCCGAGC^ 

123 CCCTAGATTGGGTGTGCGCGCGACGJU^GAAGACTTCCGAGCC^ 

123 CCCTAGftTTGGGTGTGCGCGCGACGJlGGAAGACTTCCGJlSCOGTCGC^ 

123 CCCTAGATIGGGTGTGCGCGCOcCGJUX^AAGACrrcCGJUSCGGTCGa^ 

CCCcaGgTItKKrrGTGCGCGCgaCtACXjAAGACTTCcGAGCGgTC 



184 CGACAACCTATCCCCAAGGCTCGCCatCCCGJUK^GcAGGGCCTGGGCTCAGCCCGGGTACC 

184 CGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGTACKXSCCItKXSCTCAGCCC 

184 CGWJUlCCrATCCCCaAGGCTCGCCaaCCCGAGGGTRGGGCCr^ 

184 CGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGTAGGGCCTGGGCTCAGCCCG^

184 CGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGCAGGGCCTGGGCTC^ 

184 CGACAACCTATCCCCAAGGCTCGCCGGCCCCSAGGGCAGGGCCTGGGCrCA^ 

184 CGAauUXTATCCCCAAGGCTCGCCGGCCCGAGGGCAGGGCCTGGGC^^ 

184 CGACAgCCTATCCCCAAGGCTCGCaUSCCCGAGGGCASGGCCTGGGCT^^ 

184 CGACAACCIATCCCCAAQGCTOKrCAGCCCGAGGGaiGGGCCTGGGCT^ 

184 CGACAACCTATCCCCAAGGCTCGCCAaCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTftCC 

184 CGACAACCTATCCCCAAGGCTCGCCAGCCCGAGGGCAGGACCTGGTC 

110 S45 184 CGACAACCTATCCCaU^CTCGCCGGCCCGAGGGCAGGGCCTCGGCCCA^ 

123 P8 184 CGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGTAGGGCCXtKM 

124 T3 184 CGACAACCTATCCCCAACGCTCGCaK;CCCGKXXrrAGGGCCT^^ 
120 HK3 184 CGACAACCTATCCCCAAGGCTCGCCaACCOSAGGGCftSGACCT^ 

184 CGACAACCTATCCCCAAGGCTCGCCGACCCGAGGGCAGGJ^CTGGGCrCAGCCCGGGTO 

184 CGTOUSCCTATCCCCAAGGCgCGTCGGCCCGAGGGCAGGACC^^ 

184 CGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCACXjACCTGGGC^^ 

184 CGTOUSCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGACCTGCGCTCAGre^ 

184 CGTCAGCCTATCCCCAAGGCGCGTCGGCCCGAGGGCAOGACCTGGGCTC^
184 CGTCABCCTATCCCCAAGGCGCGTCGGCCCCSAGGGCAGGACCTGGGCTCAGCCCGGGTACC 
184 CGTCAGCCTATCCCCAAGGCaCGTCGGCCCGAGGGCAGGACCTGGGCTC^ 

CGaCAaCCTATCCCCAAGGCtCGcCggCCCGAGGGcJUX5gCCTGGGCtCAGCCcGGQt^ 



121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18
103 DK7
103-124 consensus




SEQ ID NO: ISOLATE
119 S9
117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18



245 CTTGGCCCCTCTAcGGCAATGAGGGCTItWGGTGGGCAGGATGGCTCCTGTC^ 

245 CTTGGCCCCTCTATGGCAATGAGGGCTTGGGGTGGGCAGGATGGCTCCTGTCAC 

245 CTTGOCCCCTCTATGGCAATGAGGGCTIGGGGTGGGCftSGATGGCTC CTGTC ACCCCGC^ 

245 CTTGGCCCCTCTAitKSCAACGAGGGCrrOGGGTGGGCAGGATGGCTCCTGTCAC 

245 CTTGGCCCCTCTATGGCARCGAGGGCaTGGGGTGGGCAGGATGGCTC CTGTO ^ 

245 CTTGGCCCCTCTATGGCAATGAGGGCtTGGGGTGGGCAGGATGGCTC CTGTC ACCCCG^ 

245 CTTGGCCCCTCTATGGCAATGAGGGCATGGGGTGGGCAGGATGGCrCCTC 

245 CTTGGCCCCTCTATGGCAATGAGGGCATGGGGTGGGCJtfXSATGGCTC^^ 

245 CcTGGCCCCTCTATGGCAATOAGGGCATGQGaTGGGCAGGATGGCTCCTCT 

245 CTTGGCCCCTCTATGGCAATGAGGGCATGGGGTGGGCAGGATGGCTCCTCT 

245 CTTGGCCCCTCTATGGCAATGAOGOCTTOGGaTGGGCAGGATGGCrccra^ 

245 CTTGGCCCCTCTATGGCatflTGAGGGCrrGGGGTGGGCASGATGGCTC CTO 

245 crTGGCCCCTCTATGcCAATGAGGGCriXKKXrrGGGCgGGATGGCTOT 

245 CTTGGCCCCTCIATGGCgACGAOGGCATGGGGTGGGCAGGATGGCTCCTCT 

245 CTTGGCCCCTCTATGGCAACGAGGGCATGGGGTGGGCAGGATGGCTC CTCT 

245 CTrGGCCCCTCTATGGCAATGAGGGCATGGGGTGGGCACX»TGGCTCCTCT 

245 CTTGGCCCCTCTATGGOUlTGAGGGCTGCGGGTGGGCGGGATGGCTC CTGT CcCCC 

245 CTTGGCCCCICTATGGCAATGAGGGCTGCGOGTGQQCGQQATGGCTC CTGTCTC 

245 CTTGGCCCCTCTATGGCAATGA5GGCTGCGGGTGGGCGGGATGGCin: CTG'TC 

245 CTTGGCCCCTCTATGGCAATGAGGGCTGCGGaTGGGCGGGATGGCrC CrGTC 

245 CTTGGCCCCTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCCCCCCGTGG 






103 DK7 
103-124 consensus 




245 CTITCCCCCTCTRTGGOUITGAGGGCTGCGGGTGGGCGGGAT^ 

CtTGGCCCCTCTAtGgCaAtGAGGGCttgGGcfTGGGCaGGATGGCTCCTGTCaCCCCgtGG 



SEQ ID NO: ISOLATE
119 S9
117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18
103 DK7
103-124 consensus


SEQ ID NO: ISOLATE
119 S9
117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18
103 DK7
103-124 consensus




119 S9
117 IND3
118 IND8
111 D1
112 US6



306 cTCTCGGCCTAGTTGGGGCCCCAatGACCCCCGGCGTAGGTCGCGTA ATrrG GGTAAgGTC 

306 tTCrCGGCCTAGTTCGGGCCCCACJUSACCCCCGGCGTAGGTCGCGTA ATTTG GGTA 

306 CTCTCGGCCTAGTTGGGGCCCCACAGACCCCCGGCGTAG GTCG CGTA ATTTG GOT 

306 CTCCCGGCCTAGTTGGGGCCCCACcGACCCCCGGCCrraGGTCCCGTAATO 

306 CTCCCGGCCTAGTrOGQGCCCCACGGACCCCCGGCGTAGGTCGCGTA ATr^^ 

306 CTCTCGGCCTAGTTGGGGCCCCACXyaACCCCCGGCCnACGTOGCGTA ATr^^ 

306 CTCrCGGCCTAGTTGGGGCCCCAacGACCCCCGGCGTAGGTCGCGTAATTIt;^^ 

306 CTCcCGGCCTAGTTOGGGCCCCACaGACCCCCGGCGTAGGTCGCGTA^ 

306 CTCTCGGCCTAGTTCKXK^CCCCACtGACCCCCGGCCnAGGTa^CC^ 

306 CTCTCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAGCnTOCGcA ATrrG GGTAASGTC 

306 CTCTCGGCCTAGTItXKKSCCCaiCGGACCCCCGGCGTAGGTCGCQtAAm 

306 CTCCCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAOOTCGCGCAATmGC^ 

306 CTCCCGGCCTAGTTCXXKSCCCCACGGACCCCCGGCGTAGGTCGCGCAATITGGGT 

306 CTCCCGGCCTAATTCGCK^CCCCAC&GACCCCCGGCCTAGGTCGCGtA ATcTGG GTA 

306 CTCTCGGCCTAATTGGGGCCCaiCGCSACCCCCGGCGTAGGTCGCGcAAW 

306 CTCTCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGXaGgrCGCGtAA TTTG G^ 

306 CTCTCGGCCTAGCTGGGGCCCCACaGACCCCCGGCGTAGGTCGCGCA ATTTG GGTAA GGTC 

306 CTCTCGGCCTAGCTGGGGCCCCACgGACCCCCGGCGTAGGTOTOCA ATTrG GGTAAGCnr 

306 CTCTCGGCCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAAOTGGCT 

306 CTCTCGGCCTAQCTGGGGCCCTAauCSACCCCCGOCGnCAG<nCGro 

306 CrCcCGGCCTAQCTGQGGCCCTACAGACCCCCGGCGTAGGTCGCGCA ArrrG GGcAAACOT 
306 CTCtCGGCCTAGCTGGGGCCCcACAGACCCCCGGCGcAGGTCGCGCAATITGGGtAAAGTC 

CTCtCGGCCrAgtTGGGGCCCcAC-GACCCCaSGCGtAGGTOSCGtJATtTGGGtAAgG^ 



367 ATCGATACCCTCAaVTXKrGGCTTtGCCGACCTCATGGGGTACATtCCGCTOT 

367 ATOSATACCCrCACATGCGGCTTCGCCGACCTCATGGGGTACATCCCGCTCGTCGGCG^ 

367 ATC^TACCCTCACATGCGGCTTCGCCGACCTCATGGGCroiCJlTCC^ 

367 ATCGATACCCTOUJlTGCOGCTrCGCCGACCTCATGGGGTACATCCCGCTCGTCGC^^ 

367 ATCGATACCCTCACATlSCGGCTTCGCrGACC^ 

367 ATCXaiTACCCTCACATGCGGCITCGCCGACCTCATGGGGTACATTCCGCTC^^ 

367 ATCGATACCCTCACATGCGGCrrCGCCGACCTOVrTCGGTA« 

367 ATCGATACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTOST^ 

367 ATCGATACCCTCACATGCGGCTrCGCCGACCTCATGGGGTACAritrGCra^ 

367 ATCGATACCCrCACATGCGGCTTCGCCGACCTCA3TKKX3TACArrCa3C^^ 

367 ATCGATACCCTCACATGCGGCTTCGCCGACCTCATOGGGTACATTCCGCTCGTCGGCGCCC 

367 ATCGATACCCTOUIgTGCGGCTTCGCCGACCTCATGGGgTRCArrCCGCTCCT 

367 ATCGATACCCTCAOlTGCGGCTTaSCCGACCTCATGOGGTACATrCCGCTC^^ 

367 ATCGATACCCTCACATGCGGCrrCGCCGACCTCATGGGGTACATTCCGClTO 

367 ATCGATACCCTCACGTGCGGCTrCGCCGACCrCATGGGGTACATCCCGCT 

367 ATCGATACCCTCACGTGCGGCTTXrGCCGACCTCATOGGGTACATCCCGCrroT^ 

367 ATCGAcACCCTCACGTGCGGCTTCGCCGACCTCATGGGGTACATCCCGCTO 

367 ATCGATACCCTtACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCC^^ 

367 ATCGATACCCTCACXrrGCGGCTrCGCCGACCTCATGGQGTACATAC^ 

367 ATCGATACCCTGACGTGCGGCtrCGCCGACCTCATGGGGTACATTCOT 

367 ATCGATACCCIWCQTGCOGCnrrCGCCGACCTCATGGGGTA» 

367 ATOSATACCCTtACGTGCGGOTCGCCGACCrCATGGGGTACATaCaSCTC^^ 

ATCGAtACCCTcACaTGCGGCTTcGCCGACCTCATGGGGTACATtCCG^ 



428 CCCTAGGGGGCGCTGCCAGGGCtCTGGCGCATGGCGTCCGGGTtCTGGAGGACGG CGTG AA 
428 CCCTAGGGGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTCCTGGAGGACGGOT 
428 CCCTAGGGGGTGCTGCCACGGCCCTGGCGCATGGCOTCCGGGTCCT^^ 
428 CCCTAGGGGGTGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAGGACGGCG^^ 
428 CCCTAGGGGGCGCTGCCAGGGCCtTGGCGCATGGOTCCGGGTTCTGGAGGACGGCGTGAA 






113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18
103 DK7
103-124 consensus




SEQ ID NO: ISOLATE
119 S9
117 IND3
118 IND8
111 D1
112 US6
113 P10
114 DK1
115 T10
116 SW2
122 HK4
109 SA10
110 S45
123 P8
124 T3
120 HK3
121 HK5
108 DR4
104 US11
105 S14
106 SW1
107 S18
103 DK7
103-124 consensus




428 CCCTAGGGGGCGCTCK:aU«K5CCCTGGCGCATGGCGTCCCK^^
428 CCCTJUMGGGCCKTOCCACWGCCCTGGCGCATCXXrG^
428 CCCTAGQGGGCGCroCCAGGGCtCTMCaCATGGtGTCCGGGT^^
428 CCCTAGGGGGCGCTOCCAGGGCCCTtXKrgCATGQcGTCC^
428 CCTTAGGGTCCGtTGCa^aGCCCTOGCaCATCWtGTC^^
428 CtTTAGGGGGCGCTGCakGgGCCTTGGCGCATGGCOT^
428 CCCTAGGGGGC^

428 CCCTAGGGGGCGTTGCCAGOGCCCTOK^CATGGCGTCCGGGT^^
428 CCtTWKXKSGCGTTGCCAGGGCCCTGGCGCATGGCGTCCGGGT^^
428 CCCTAGGGGGCOTTGCCMAGCrtTGGCACATGG^^
428 CCCTAGGGGGCGTTGCCAGAGCCCTGGaUaUrGGTGTCra
428 CCCrtGGGGGCGCTGCOUKKKrCCTCGCGCATGGCG^^
428 gtPrCGGAGGCGCTGCaU3GGCCCTGGCGCATGGCGTCCGGi»a^X'C^^^
428 CcCTCGGgGGCGOTCCAGGGCCCTGGCGCRTGGCGTCXGGGTTC^^
428 CTCTtGGAGGCGCTGCCaGGGCCCOTGCGCATGGCGTCCG^^
428 CTCTcGGAGGCGCTGCCAGGGCCCTOGOGCATGGCGTCCXWOT
428 CTCrtGGAGGCGCTOCCAGGGCCCTGGCGCATGGCGTCCGGGT^

CcCTaGGgGGcGcTGCCAGgGCccTGCCgCAtGGcGTCCGgGTtC^^ 



489 CTATGCiUlCftGGGJUlCcTcCCCGOTiti^^^^ 

489 CTATCC3UU:JtfX»AACTTGCCaxnTG C^ 

489 rTATCgAACftGGGAACTrGCCCG Gn ' lXiCrLT i Ib G ciim CIATCC 

489 "t^-rSSfiGGSfc l UlLiA lVriLLl CTT G G CITTG CTGTCC 

t il ctatSSaSgg^^ 

489 CTATGCAAOtfXXaUlTcTGCCCGGTOCTCT^^ 
489 CTAcGCAACAGGGAATTrGCCCGGTTGCTOTTC^ 

489 CTATGCAACAGGGAATcrrGCCCGGTTGCrCcOTTr^^ 

489 EStSSaScG^ 

489 CrartKyUCAOGGAATTTGCCCGCnTO^ 

489 CTATGCAACAGGGAATCTGCCa;UlUl, LitriaLrcrA TOT 

489 CTATGCARCAGGGAATCTGCCTGGTIGCTCT^^ 

489 tTAcGCWlCAGGGAATTroCCrGGTTG CTCTTr C^^ 

4 89 CTAtGCAACAGGGAATTTACCCGGTTGCn rTTTy rrCT 

489 CTAcGOUtflAGGGAATaTACCa KyrrGCTCTITCTCTAT^^ 

4 89 crrATGCAACAGGGAAlVlUVCTlalj'ritiW'AV iTl'C I'L i'Aiv xa CCi'i"iiiiGwAi"i\jd' w'l'v. i 

ill ctatcoSSgggaac^ 

cTAtGCSUUMGGAAtrtgCCeGGTTGCtCtTTcrrCTATCrrcCTetT^^ 



SEQ ID NO:
119 S9 550
117 IND3 550
118 IND8 550
111 D1 550
112 US6 550
113 P10 550
114 DK1 550
115 T10 550
116 SW2 550
122 HK4 550
109 SA10 550
110 S45 550
123 P8 550
124 T3 550
120 HK3 550
121 HK5 550



TGTTTGACCATCCCAGCTTCCGCT 
TGTTTGACCATCCCAGOTCOGCT 
TGTTTGACCgTCCCACOTPCCGCr 
TGTTTGACCATCCCAGOTrCCGCT 
TGrrnGACCATtCCAGCTTCCGCT 
TGcCTGACCATCCCAGCgTCCGCT 
TGTtTGACCATCCCAG CTTC CGCc 
TGTCTGACCATCCCAG CTTC CGCT 
TGTCTGACCATCCCAG CTTC CGCT 
TGTTTGACCATCCCAGCTTCCGCT 
TGTTTaACCATCCCABCITCCGCT 
TGCTTOACCATCCCAGCTTCCGCr 
TGtCTOACCATCCCAGCTTCCOCT 
TGCTTGACCATCCCAGCTTCCGCT 
TGCITGACCACCCCAGCTTCCGCT 
TGtcTGACCACCCCAGtTTCCGCr 




108 DR4 550 TGCtTGACCGTGCCCGCaTCgGCC 

104 US11 550 TGCCTGACTOTGCCCG CTTCA GCC

105 S14 550 TGCCTOACTCrrGCCCGCTTCAGCC

106 SW1 550 TGCCTGACaGTGCCCGCCTCAOCC

107 S18 550 TGtCPGACtGTGCCCGCGTCAGCt 
103 DK7 550 TGcCTGACcGTGCCCGCtTCgOCc 

103-124 consensus TGttTgACcatcCCaGctTCcGCt 



FIGURE 6D



SEQ ID NO: 
128 
125 
126 
127 

125-128



ISOLATE 
T2 
T4 
US10
T9 

consensus 



1 ATGJlgCACAAtTCCTAAACCTCJUUtfSAAAAACCAAAAGAAACACtAA^ 

1 ATQAGCAOUUVTCCTJUUUrCTCAAAGAAAAACCAJVAAGAAACACcAAC^^ 

1 ATGAGCACAAATCCTAAJ^CTCAAAGAAAAACCJUU^AGAAACACtAACC^ 

1 ATGAGCACAAATCCaAAACCcCMAGAAAAACCAtAAGAAACACcAACCGTOSCCCACA^ 

ATGAGCACAAaTCCtAAACCtCAAASAAAAACCAa A Afi AA APAC - AACCGTCGCCCACA - G 



SEQ ID NO:
128 
125 
126 
127 

125-128 



T2 
T4 

US10
T9 

consensus 



62 ACCrrrAAGTTtCCGGGCGGCGGCCUUSATCGTTGGCGGAGTATACTTG^^ 

62 ACGTTAAGTTcCCGGGCGGCGGCCAGATCGTrGGCGGAGTATAOTCTro 

62 ACGTTAAGTTtCCGGGCGGCCXiCCAGAawnXX^CGGAGTATA CTTGTTG CCGCGC^ 

62 ACGTTAAGTTcCCGGGCGGCGGCCAGATCGTrGGCGGAGTATACTTGTTGCCGra 

ACGITAAGTT - CCGGGCGGCGGCCAGATCGTTGGCGGJWTTATACTTG tTC 



SEQ ID NO:
128 
125 
126 
127 

125-128 



T2 
T4 
US10
T9 

consensus 



123 CCCCAGGTnX3GTGTGCGCGCGAaU«X3AAGACTTCGGAGCGg^ 

123 CCCCAGGTrGGGTGTGCGCGCGACAAOGAAGACTTCGGAGCGaTCCCAGCCACGTGC^^ 

123 CCCCJtfXTITGGGTCTGCGCGCGACAAGGAAGACITCGGAGCGGTCCC^^ 

123 CCCtJUXnTGG Ul GT GCG CaCGACAAGGAAGACTTCGGAGCGGTCCCAGCCAC^ 

CCCcAGGTTGGGrrGTGCGCgCGACAAGGAAGACrrCGGAGCGgTCCCAGCCaCGT^ 



SEQ ID NO: 
128 
125 
126 
127 

125-128 



T2 
T4 

US10

T9 

consensus 



184 a3CCAGCC»TCCCUUVAGATCGGCGCTCCMTGGCAAGTCCTOGGGJ^^ 
184 CGCCAGCCCATCCCCAAAGATCGGCGCrCCACTGGCAAGTCCTtKSGGAAAACCAG 
184 CGCCAGCCCATCCCCAAAGATCGGCGCcCCACTGGaUWCnCCTGGGGAAAACC^ 
184 CGCCAGCCCATCCCCAAAGATCGGCGCtCCACTGGCAAGTCCTOGGGAAAACaUX^ 

CGCCAGCCCATCCCcAAAGATCGGOSCtCCACTGGCAAGTCCTGGGGAAAACCAGGATAcC 



SEQ ID NO: ISOLATE

128 T2 245 CCTGGCCXCTCTATGGGAATGAGGGgCTCGGCTGGGCAGGATGGCre 

125 T4 245 CCTGGCCCCnTTATGGGAATGAGGGACTCGGCTGGGCAGGATGGCTCCTC 

126 US10 245 CtTGGCCCCTATATGGGAATGAGGGACTCGGCTGGOCAGGATGGCTC CTCTC

127 T9 245 CcTGGCCtCTATATGGGAATGAGGGACTCGGCTGGGCgGGATGGCn^CTGTCCCCCCGAGG 

125-128 consensus ccTGGCCcCT-TATGGGAATGAGGGaCTCGGCTGGGCaGGATGGCTCCTGTCCCCCCGAGG 



SEQ ID NO:
128 
125 
126 
127 

125-128 



T2 
T4 
US10
T9 

consensus 



306 - TTCtOnxrCCTCtTGGGGCCCCAATCACCCCCGGCATAGGTCGCGCAAtGTGGGTAA 
306" TTCCCiTreCCTCcTGGGGCCCaU^TGACCCCCGGCATAGGTCGCGCAA CCT 

306 TTCCCGTTCCTCTTGGGGCCCCAgTGAcCCCCGGCATAGGTCGCGC^ 

TTCcCGTCCCICtTGGGGCCCCAaTGAcCCCCGGCATACKnrGCGCAAcGTGGGTAAgGTC 



SEQ ID NO:

128 

125 

126 

127 



T2 
T4 

US10
T9 



125-128 consensus 



367 ATCGATACCCTAACGTGCgGCtTTGCCGACCTCATGGGGTACaTCCCCGTCGTAGGCGcCC 
3 67 ATCCyiTACCCTAACGTOCaGCcOTtSCCGACCTCATGGGGTACgTCCCCGTCGTAGGCGgCC 
367 ATCGATACCCTAACGTGCGGCTTIGCCGACCTCATGGGaTACATCCCCGTCGTgGGCGCtC 
367 ATCGATACCCTAACGTGCGGCTITGCCGACCTaVTGGGgTACATCCCCOTCGTaGGOT 

ATCGATACCCTAACGTGCgGCtTTGCCGACCTCATGGGgTACaTCCCCGTCGTaGGCGccC 



SEQ ID NO:
128 



T2 



4 28 CGcTtGGTGGtGTCGCCAGAGCTCTtGCGCATGGCGTGAGAGTCCTGGAGGACGGaGTTAA 






125 
126 
127 



T4 

US10
T9 



125-128 consensus 



428 CGtTgGGTGGCCrrCGCCAGAGCrCTCGCGCATGGCGTG3tf»AGTC 

428 CG CX '' I ' GG ' i G GCGTCGCCAGAGCTCTCGCGCATGGCGTGAGgGTCCTGGAGGACGGGGrniA 
428 CGCTTGGTGGCGTtGCaUaAGCTCTCGCGCAcGGarrGAGaGTCC^ 

CGCTtGGTGGcGTcGCCJlGAGCTCTcGCGCAtGGCGTGAOaGTCClXM^^ 



SEQ ID NO:

128 
125 
126 
127 



T2 
T4 
US10
T9 



125-128 consensus



489 TTATGCAACAGGtAA CTTA CCcGGTTGCTC LTm'CTA Tc TTCTTG CTaG^^ 
489 TTATGCMCAGGGMCTTACCtGi^^ 

489 TTATGCAACAGGGAACTTACCcGbTiXjCl'LC l'i'l'XC T AlVnVl'lt* ^^ 
4 89 TTATGCAACAGGGAACCTACCtGGTTGCTCtTTTTCTATCTTCrrGCTt^ 

TTATGCAACAGGgAACtTACC - GG"ri^CTCc'iTiU'CrATC'nX;i"lUCTgGCCcTaCTGTCC 



SEQ ID NO:

128 

125 

126 

127 



T2 
T4 

US10

T9 



550 TGCATCACtATTCCgGTtTCaGCT 
550 TGCATCACCATTCCAGTCTCcGCT 
550 TGCATCACCATTCCAGTCTCTGCT 
550 TGOlTCACCACTCCgGcCTCTGCT 



125-128 consensus



TGCATCACcAtTCC - GtcTCtGCT 



FIGURE 6E



SEQ ID NO:
131 
132 
133 
129 
130 

129-133 



ISOLATE 
DK11
SW3
DK8
T8
US1

consensus 



1 ATGAGCACAAATCCTAIU^CTOUUIGAAAAACCAAAAGAAATACJUU^ 
1 ATGAGCACMATCCTiUUUrcrrCAAAGAAAAACCAAiUUj^^ 

1 ATGAGCACAAATCCTAAArCTCAAAflAAAAAC Ca & a a n a I^IiC3^CJ^IiACCGCCGCCCAC2kCG 
1 ATGAGCRCAAATCCTAAACCTCA^VGAAA AflC^a a aafiR aftCACAAACCGCCGCCCACAiSG 
1 ATGAGCACAAATCCTAAACCTCAAAGAAAAACCAAAAGAAACACAAACCG^ 

ATGAGCACAAATCCTAAACCTCAAAGJUU^AACCAAAAGAAAcACAAACCGCCGCCCACAGG 



131 
132 
133 
129 
130 



DK11
SW3
DK8
T8
US1



129-133 consensus 



62 ACGTrAAGTTCCCGGGTGGCGGCCAflATCGTrGGCGGA GTTTA LT^ 
62 ACGTTAAGTrCCCGGGTGGCGGCCAGATCGTT^ 

62 ACGTCAAGTrCCaXXnTOCGGCaUSATCGTTGGC^^ 

62 ACGTCAAGTTCCCGGGTGGCGGtaiGATCGTrGGCGGAGTTTAC^ 

ACGTtAAGTrCCCGGGTGGCGGcCAGATCGTTGGCGGAGTrrACTTGCTGCCGC^ 



SEQ ID NO:

131 
132 
133 
129 
130 



DK11
SW3
DK8
T8
US1



129-133 consensus 



123 CCCCAGGTTGGGTGTGCGCaCGACAAGGAAGACTTCCGAGCGATCCCAGCCG CGTO 
123 CCCCAGGTTGGUTtfiXjC GCG CGACAAGGAAGA CT TCCGAGCGATCCCAGCLXj^ ^ 
123 CCCCASGTTGGGTGTGCGCGCGACAAGGAAGt CTTC CGftGCGATCCCAGCCGCGT^ 

123 CCCcAGGTTGGGTCnGCGCGCGACAAGGAAGACTTCCGAGCGATCCCAGCCGCGT^^ 

CCCcAGGTTGGGTGTGCGCgCGACAAGGAAGaCTTCCGAGCGATCCCAGCCGCGT^ 



SEQ ID NO:
131 
132 
133 
129 
130 

129-133



DK11
SW3
DK8
T8
US1

consensus 



184 CGCCAGCCCATCCCGAAAGATaKSCGCTCCACCGGOlAGcCCTGGGGAAWSCC^^ 
184 CGCCAGCCCATCCCGAAAGATCQGCGCTCCACCGGCAAGTCCrrGGGGAAAGCCAGGATATC 
184 CGCCAGCCCATCCCGRAAGATCGGCGCTCau:CGGCAAGTCCTOGGGAAAACCg<^ 
184 CGCOUSCCCATCCCGAAAGATCGGCGCTCCACCGGCAAGTCCTGGGGAAAACC^^ 
184 CGCCAGCCCATCCCGAAAGATCGGCGCTCCACCGGCAAGTCCTGGGGAAAgCCAGGATATC 

CGCCAGCCCATCCCGAAAGATCGGCGCTCCACCGG 



SEQ ID NO:

131 
132 
133 
129 
130 



DK11
SW3
DK8
T8
US1



129-133 consensus 



245 CTTGQCCCCTGT A TGGAAACGAGGGCTGCGGCTGGGCAGGTTGGCTC CTGTC CCCCC 

245 CTIXXX:CCCTGTATGGAAACGAGGGCTGCGGCTGGGCAGGTO 

245 CTTGGCCCCTGTATGGAAACGAGGGCTGCGGCTGGGCAfiCmxXSCreCT 

245 CTTGGCCTCTtTACGGAAACGAG<X»CPGCGQtTGGGCAGGTTG^ 

245 CTrGGCCrCTgTACGGAAACGAGGGCTGCGGcTGGGCAGGTrGGC^ 

CTTGGCecCTgTAtGGAAAOSAGGGCTGCGGCTGGGCaGGTTGGCT^ 



SEQ ID NO:

131 

132 

133 

129 

130 



ISOLATE
DK11
SW3
DK8
T8
US1



129-133 consensus 



306 GTCTOlTCCTAATIGGGGCCCaVCnXSACCCCCXXSaiTA 

306 GTCTCATCCTAATItKXWCCCaiCTCACCCCCGGCATAGATCA^^ 

306 GTCTCGTCCTACTIXXSGGCCCCACTGACCCCCGGOITAGATC^ 

306 GTCrCGTCCTACTTOKXSCCCCACTGACCCCCGGaiTAGATCACGtAATIT^ 

306 GTCTCGTCCTACTTGGGGCCCCACTGACCCCCGGCAcAGATCACGTAACT^^ 

GTCTCgTCCTAcTTGGGGCCCCACTGACCCCCGGCAtAgATCACGcAAtT^ 



SEQ ID NO:

131 

132 

133 

129 



ISOLATE
DK11
SW3
DK8
T8



3 67 ATCGACACCATTA CGlHjlUjlTX ' l^j CCGACCTCATGGGGTACAT^C CTCT C GTc^ 

367 atcgaScSttacgtgtggttttgccgacctcatggggtac^^ 

367 ATCGATACaVTTACaTGTGGTTrTGCCGACCTCATGGGGTACATCCCTGTCGTITC 



130 US1
129-133 consensus

62/89
FIGURE 6E

367 ATCGATACCATTACg'lXilX;G"l"l"riXiCCGACCTCATGGGGTACATCCC"^^ 
ATCGAcACa^TTACgr itjUWrxn 'r G CCGACCTCATfflSGCTACW 



131 
132 
133 
129 
130 

129-133 



DK11
SW3
DK8
T8
US1

consensus 



428 CGGtCGGAGGCGTCGCCAGAGCICTGGCAO^CG GTGCTA GAGTC 

428 CGGTCGGAGGCGTCGCCAGAGCTCTGGCACACG gTGTTA GAGT CCTTC 

428 CGGTtGGAGGCGTCGCCAGAGCTCTGGCACACGGTGTTAGGGTCC^^ 

428 CGGTCGGAGGCGTCGCCAGAGCTCTGGCACAtGGrnSTO^^ 

428 CGGTCGGJUWCtrrCGCCAGAGCTCTGGCRCAcGOTGTTAGGGTCCTGGAA 

CGGTcGGAGGCGTCGCCAGAGCTCTGGCACAcGGTGTTAGgGTCCTGGAA^ 



SEQ ID NO:

131 
132 
133 
129 
130 

129-133 



ISOLATE
DK11
SW3
DK8
T8
US1

consensus 



4 89 TTACGCAACAGGGAATCTGCLntJGriUCT CrriU'i C'XAT CTTCTTACTTG C 
4 89 TTACGCAACAGGGAATCTGC Clt^ G TT G CTCT ri ' lUV lA T LnuVlU ' A CTTG C 

4 89 TTACGCAACAGGGAATTrGCCltj<j"ri\aL'iLl i I'i l'CTATL'l"itJ'l'lXjLl'i\>C 

489 CTAtGCAACAGGGAATrrGC LUXiG ' f TGCTCr TTTTCTATCrrC 

489 tTRrGrAArAGffGAAT*"^^*^^ ■it^^.^'in'UL: A'IXJri'L'i'i'auA"A\jWA u i a v, i.\»AA,q 

tTAcGCAACAOGGAATcTGCCltWnXiC TC tlTl'l L 1 Al'Cl'i'LTraCri'GCTCl'l'ClXTlVg 



SEQ ID NO: ISOLATE

131 DK11 550 TGCTgCACAGTGCCAGTGTCTGCG

132 SW3 550 TGCTtCACAGTGCCAGTGTCTGCG

133 DK8 550 TGCTgCACAGTGCCAGTGTCTGCG

129 T8 550 TGCTtCACAGTGCCAGTGTCTGCA

130 US1 550 TGCgcCACgGTGCCgGTGTCTGCA

129-133 consensus TGCt-CACaGTGCCaGTGTCTGCg



FIGURE 6F

SEQ ID NO: ISOLATE

X31 Dial 1 ATGAGCACAAATCCTAAACCTCAAAGAAAAACCAAAAGAAATACAAACCGCCGCCC^^ 

132 SH3 1 ATGJ^^CACJUUVrCCTAAACCTCRAaGAJUUUVCCAAAAGftAATJ^^ 

133 DKS 1 ATGAGCACAAATCCTAAACCTauU^aAAAAACCAAAAGJUU^ 

129 T6C 1 ATGAGCACAAATCCTJUUVCCTCAAJU3JUUU ^ CAAAAGAAAa^^ 

130 USl 1 ATGRGCACAAATCCrAAACCrrCRAAGAAAAa C CAAAAGAAACACAAACCGCCGCCCJtf^ 

125 T4 1 ATGAGCACAAATCCrAAACCTCJUUtfjAAAAA C CAAAAGAAACACcAA^^ 

126 US 10 1 ATCSAGCACAAATCCTAAACCTCAAAGAAAAACCAAAAGAAACACtAACCGTCG^^ 

127 T9 1 ATGAGCACAAATCCaAAACCcCAAAGAAAAACCAtAAGAAACAC c AACCCnCTCCCACAi^ 

128 T2 1 ATGAGCACAAtTCCTAAACCTCAAAGAAAAA C CAAAAGAAACR C TAACCGTCGCCCACAaG 

134 SB3 1 ATGAGCACAAaTCCTAAACCTCAAAGAAAAACCAAAAGAAACA C TAACCGcCGCCCACA^ 

125-134 consensus ATGAGCACAAaTCCUUU^CtCAAAGAAAAACCAaAAGAAAci^aAACCQcCGC^ 



SEQ ID NO: ISOLATE

131 DKll 62 ACGTTAAGTTCCCGGGTGGCGGCCAGATCGTTGGCGGAGTTTAt'l'lULn^ 

132 SW3 62 ACGTTAAGTTCCCgGQTGGCGreCCAW rCGri^S GCGGA GTrT A L "^^ ^ 

133 DK8 62 ACGtTAAGTTCCCGGGTGGCGGCCAQAlX A»n'lXj GCGGA STTTRCrTG CT 

129 T8 62 ACGTCAAGTTCCCGGGTGGCGGCCAGATCGTIGGCGGJWSTTTACTTGCT^ 

130 DSl 62 ACGTOU\GTTCCCGGGTGGCGGtCAGATCGTTGGCGGACrrTTACT^ 

125 T4 62 A CG TT A A G TT C CCX;GGCGGCGGCCftES Ai nJ U riX aG CGGAgIATA LT^ 

126 USIO 62 ACGTTAAGTTtCCGGGCGGCGGCCAGATCGTTGGCGGAGTATACTTGTIG^^ 

127 T9 62 ACGTrAAGTTcCCGGGCGGCGGCCAGATCGTTGGCGGWITATACTr^^ 

128 T2 62 ACGTTAACnTtCCGGGCGGCGGCCAGATanTGGCGGAGTATAOTGCTC 

134 S83 62 ACC3TcAAOTTcCCGGGCGGt(%CaU3ATC6TTCK*CGGiUnA^^ 

125-134 consensus ACCnrtAAGTTcCCGGG-GGcGGcCAGATCCTrTGGCGGAGT-TACTTGcTGCCGCGCAGC^ 



SEQ ID NO: ISOLATE

131 DKll 123 CCCCAGGTIX3GGTGTGCGCaCGAauUCK*AAaACrTCCQAGCGATC 

132 SH3 123 CCCCACX7rroGG7GTGa3CX;CGACJUU3GAAGACT^ 

133 DK8 123 CCCCAGGTTGGGTGTGCGCGCGACAAfiGAAG tCTrCCGAGCGATCCCAGCCGCGTGGQAOg 

129 TS 123 CCCtAGGTTGGGTGTGCGCGCCU^CAACXSJUUQACTTCCGM^^ 

130 USl 123 CCCO^GGTTGGGTGTGCGCGCGACAAGGAAGACTTCCGAGCGATCCCiUSCCGCGTGGG^^ 
^ 125 T4 123 CCCCAGGTTGGGTGTGCGCGCGACAABGAAGACTTCGGA6CGATCCCAGCCACGTGGG 

12 6 USIO 123 CCCCJtfXrriXkKrrOTGCGCGOTACA^^ 

128 T2 123 CCCcJtfXrrTGGGTgTGaSCGCGACAAGGAAGACTTCGGAGCGGTCCC^^ 

134 S83 123 CCCgAGaTTGGGTGTGCGCGCGACgAGGAAaACTTCcGAaCGGTCCCAGCC&CGTGGgAGG 

125-134 consensus CCCcAGgTroGGTGTGCGCgCGACaAGGAAgaCTTCcGAgCGaTCCCAGCCgCXnt^g^ 



SEQ ID NO: ISOLATE

131 DKll 184 CGCCAGCCCATCCXrGAAAGATCGGCGCTCCACCGGCAAGcCCTGGGGAAAGCCAGGATATC 

132 SW3 184 CGCCAGCCCATCCCGAAAGATCGGCGCTCCACCGGCAAGTCCTGGGGAAASCCAGGATATC 

133 0K8 184 CGCCAGCCCATCCCGAAAOATaXXrGCICCACCGGCAAGTCCTGGGGAAAACCgGGATATC 

129 T8 184 CGCCAGCCttTCCCGAAAGATCGGCGCTCCACCGGOUUrrCCTCGGGAAAAC 

130 USl 184 CGCC»GCCCATCCCGAAAGATCGGCGCTCCACCGGCAAGTCCTGGGGAAAgCC»GGATATC 

125 T4 184 CGCCAGCCCATCCCCAAAGATCGGCGCTCXSlCrroGCAAGTCCTGGGGAA^^ 

126 USIO 184 CGCOtfSCCCATCCCOUJWCaiTCGOCGCcCCACTGGCAAGTCCTGGGGJU^ 

127 T9 184 CGCCAGCCCATCCCCAAAOATCGGCGCTCCACTGGCAASTCCTGGOGAAAACCAGGATACC 

128 T2 184 CGCaUSCCCATCCCTAAAGATCGGCGCTCCACTGGCAAGTCCTGGGGA^^ 

134 S83 184 CGCCAGCCpVTCCCTAAAGATCGGCGCaCCACTGGCAAGTCCIXKSGGAAggCC^ 

125-134 consensus CGCCAGCCCATCCCgAAAGATCGGCGCtCCAC-GGCAAGtCCTGGGGAAaaCCaGGATAtC



SEQ ID NO: ISOLATE

131 DKll 245 CTTGGCCCCTCTATGGAAACGAGGGCTGCGGCTGGOCAaGTTGGCTCCTGTCCCCCCGCGG 

132 SW3 245 critXiCCCCrGTATGGAAACGAGGGCTGCGGCTGGGCAGGTTO 

133 DKB 245 CTTGGCCCCTGTATGGAAACGACXXjCTGCGGCTGGGCAGCrn^ 



FIGURE 6F



130 US1
125 T4
126 US10
127 T9
128 T2
134 S83
125-134 consensus


SEQ ID NO: ISOLATE
131 DK11
132 SW3
133 DK8
129 T8
130 US1
125 T4
126 US10
127 T9
128 T2
134 S83
125-134 consensus


SEQ ID NO: ISOLATE
131 DK11
132 SW3
133 DK8
129 T8
130 US1
125 T4
126 US10
127 T9
128 T2
134 S83
125-134 consensus




306 grrCTCATCCTRATTGCXKSCCCaU^ 

306 GTCTCATCCTAATItXKKK:CCCACTQACCCa 

306 tmTKCTCCTJCTTGGGGCCCCACTC^CCCCC^ 

306 CTtnrGTCCTACTrGGGGCCCCACTGACCCCCGGCATAGAT^^ 

306 GTCTCGTCCTACTTGGGGCCCCACTCy^CCCCGGCAcAGA^ 

306 TrCCCGTCCCTCcTGGGGCCCO^TGWrCCCCCXSCATACKnCGCGCAAC^^ 

306 TTCCCGTCCCTCITGGGGCCCCAcOTAtCCCCGGCATACKnrGCGCAACC^^ 

306 TTCCCGTCCCTCrrGGGGCCCCAgTCACCCCaXJCATAWn^ 

306 TTCTCGTCCCTCTTCGGGCCCakaTGACCCCCCWCATAGGTCGC^ 

306 rrCTCGcCCtTCaTGCXWCCCCAccGACCCCCGG<MAaaTCGC:GCA^ 

- TC t Cgt CC t - ctTGGGGCCCCActGAcCCCCGGCAtAgaTC - COcAA- tTGGG tAa - GTC 



367 JirCGIiCACCXTrJkC U ' l\jfi\M^ ^ 

367 ATCGACRCCATTAC GlVAUalT^ 

367 ATCC5ACACCArrAC GnVAtJGTiUUi:i CCGACCnxaT^^ 

367 ATgGATftCCMTACa lX;iU>lTAlU CCGACCTCATGG^ 

367 ATCGATACCATTAarrGTGGTrrTGCCGACCrCA^^ 

367 ATCGATACCCTAACGTGCaGCcTTGCCGACCTCATGOGGTACgTCCCCGTCGTaGGCG^ 

367 ATCGATACCCTAACCmSCGGCTTrGCCGACCTCAT^ 

367 ATCGATACCCTAACCrPQCGGCTTOCC^ 

367 ATCGATACCCTAACGTGCGGCrrroCCGACCTCATGGGGTACATC 

367 ATCGATACCCTAACGTGCGGtTTTGCCGACCTCATGGGGTACATaCCCGTCGT 

ATCGAtACC - T - ACgTG - gGttTTGCCGACCTCATGGGgTACaTcCC - GTCGTtGGCGccC 



SEQ ID NO:

131 

132 

133 

129 

130 

125 

126 

127 

128 

134 



ISOLATE
DK11
SW3
DK8
T8
US1
T4
US10
T9
T2
S83



125-134 consensus 



428 CGGTaXSAGGCGTCGCCAOAGCTCTGGCACACGGTGTTAGAGTCC^^ 
428 CXSGTCGGAGGCGTaSCCAGAGCTCTGGCACACGGTCTTAGAGTCCTGGAA^ 
428 CGCTtGGAGQCGTCGCCAGAGCTCTGGCACACGGXGrrAOQGTCCTG^ 
428 CGGTCGGAGGCCmXSCCAGAGCTCTGGCACAtGGTCnrrAGGGTCCTGC»AGATO 

428 CGtTg<KrrGCXOTrCK:CAGAGCTCrCGa5«TCXSCGTGA^ 

428 CGCTTGGTGGCGTCGCCAfiAGCTCTCGCGCATWrGTGAOg^ 

428 CGCTTGGTGGCGTtGCCAGAGCTCTCGCGCAcGGCGTGAGAGTC^^ 

428 CGCTTCGTGGtGTcOCauyVGCrCTtGCGCATOGC^^ 

428 CcgTITOcGGcGTtGCOWGAGCcCTcGCcCATGGgGTGAOgGTtC^^ 

CggTtOGaGGcGTcGCCAGWK:tCTgGCaCA.GGtGT-AG-GTcCTGGA-GACG^ 



SEQ ID NO:
131 DK11 489
132 SW3 489
133 DK8 489
129 T8 489
130 US1 489
125 T4 489
126 US10 489
127 T9 489
128 T2 489




T L ' rrrncrA T L - rrLTiA CTTGC
134 S83
125-134 consensus

FIGURE 6F

489 TTATGCAACgGGgAAtTrgCCCGGTTGCTCtTrcTCTATOT 

tTAtGOUCaGGgAAttTgCCtCGTrcCTCtTTtTCTATCTTCtTgcTtGC - cTtcTGTCc 



131 DK11 550 TGCTgCACAGTGCCAGTGTCTGCG

132 SW3 550 TGCTtCACAGTGCCAGTGTCTGCG

133 DK8 550 TGCTgCACAGTGCCAGTGTCTGCG

129 T8 550 TGCTtCACAGTGCCAGTGTCTGCA

130 US1 550 TGCgcCACgGTGCCgGTGTCTGCA

125 T4 550 TGCATCACCATTCCAGTCTCcGCT

126 US10 550 TGCATCACCATTCCAGTCTCTGCT

127 T9 550 TGCATCACCAcTCCGGcCTCTGCT

128 T2 550 TGCATCACTATTCCGGTTTCaGCT

134 S83 550 TGCATCtCTgTgCCAGTTTCcGCC

125-134 consensus TGCatCaCagtgCCaGtgTCtGCt

FIGURE 6G



SEQ ID NO:

138 

135 

136 

137 



DK12
HK10
S52
S2



135-138 consensus



1 ATGAGCACACTrCCTJUUlCCTCAAAGAAAAACCARAfiGAAACACCftTCC 
1 ATGJUjCACACTTCCTAAACCTCAAAGAAAAACCAAAAGAAACACC^ 
1 ATGAGCACACTTCCTJUUlCCTCAAAGAAAJU^aUUU^ 
1 ATGAGCACACTTCCrJUUCCTOUJlGAAAAACC^ 

ATGAGCACACTTCCTAAACCTCAAAGAAAAACCAAAAGAAACACCATCCGTCGCCCACAGG 



SEQ ID NO:

138 
135 
136 
137 



DK12
HK10
S52
S2



135-138 consensus 



62 ACGTcAAGTTCCCGGGTGGCGGACAGAlXAjriX ^ G' ny SACnA^ 

62 ACGTTAAGTTCCCGGGTGGCGGACAQATCGTTGG1XXjA£?I^^ 

62 ACGTTAACnTCCCGGGTGGCGGAa^lU»'n\:]FGit3GAGTATAL ^^ 

62 ACaTcAAGTTCCCGGGTGGCGGACAGATOnTGGTGGAGTATACGTGTTGCCGCGC^^ 



ACgT-AAGTTCCCGGGTGGCGGACAGA'; 



^rACGTGTTGCCGCGCAGGGG 



SEQ ID NO: 

138 

135 

136 

137 



DK12
HK10
S52
S2



135-138 consensus 



123 CCCACGATTCXXntnt^CGCGCOACGCGTAAAACTTCTQAACGGTCaCAGCCT 
123 CCCACGATTCGGTGTGCGCGCGACGCGTAAAftCTTCTGAACGGTCgCAGC CT 
123 CCCACGATTGGGTGTGCGCGCGACGCGTAAAACrrCTGAACGGTCACAGCCTCGC^ 
123 CCCACGATTGGGTGTGCGCGCGACGCGTAAAACTTCTGAACGGTCACJUSCCT 

CCCACGATTGGGTGTGCGCGCGACGCGTAAAACTTCroAACGGTC 



138 
135 
136 
137 



DK12
HK10
S52
S2



135-138 consensus 



184 CGACAGCCTATCCCCAAGGCGCGTCGGAGCGAAGGCCGGTCCTGGGCTCAGCCtGGGTACC 

184 CGACaCCCTATCCCCAAGGCGCGTCGGACCGAAGGCCGGTCCTGGGCTCAOCCCGGGT^ 

184 CGACAGCCTATCCCCAAGGCGCGTCGGAGCGAAGGCaxrrCCTGGGCTC^ 

184 CGACAGCCTATCCCCAAGGCGCGTO^GAGCGAAGGCCGaTCCTGGGCTC^ 

CGAa^CCTATCCCCAAGGCGCGTCGGAGCGAAGGCCGgrrCCTGGGCTC^ 



SEQ ID NO:
138 
135 
136 
137 

135-138 



ISOLATE
DK12
HK10
S52
S2

consensus 



245 CTTGGCCCCTCTATGGTAJlCQAGGGCrGCGGGTGGGCAGGgT^ 
245 CTTGGCCCCTCTATGGTAACGAGGGCTGOyxrrGGGCaGGaT^ 
245 CTnSGCCCCTCTATGGTAAtGACGGCTGCGGGTGGGCAGGGTGGC^ 
245 CTTGGCCCCTCTATGGTAAcGAGGGCTGCGGGTGGGCAGGCjrGG 

CTTGGCCCCTCTATGGTAAcGAGGGCTGCGGGTGGGCAGGgTXXSC^ 



SEQ ID NO:

138 
135 
136 
137 



DK12
HK10
S52
S2



135-138 consensus 



306 CTCCXGTCCATCTTGGGGCCCAAACGACCCCCGGCGgaGGTCCCGC^ 
306 crC(XGTCCATCTIXKXK5CCCAAACGACCCCCGGCGacGGTCCCGCAA 
306 CTCCCGTCOlTCTTGGGGCCCAAACGACCCCCGGCGGAGGTCCCGCAAm 
306 CTCCCGTCCATCTItX;GGCCC»AAtGACCCCCGGCGGAGGTCCCGCAATTr^^ 

CTCCCGTCCATCTTGGGGCCCAAAcGACCCCCGGCGgaGGTCCCGCAATTrGGCT 



SEQ ID NO: 

138 

135 

136 

137 



ISOLATE
DK12
HK10
S52
S2



135-138 consensus 



367 ATCGATACCCTcACQTGCGGATTCGCCGACCTCATGGGGTACATCCCGCTCGTCGGCGOT 
367 ATCGATACCCTTACGTGCGGATTCGCCGACCTCJlTOGGgTACATCCCG CTC 
367 ATCGATACCCTTACGTGCGGATrCGCCaACCTCATGGGGTACATCCCGCTCGTCGGCT 
367 ATCGATACCCTTACGTGCGGcTTCGCCGACCnxyiTGGGGTACATCCCGCT^^ 

ATCGATACCCTtACGTGCGGaTrCGCCGACCTCATGGGGTACATCCCGCTC^^ 



SEQ ID NO:
138



ISOLATE
DK12



428 CtGTAGGgGGCGTCGCAAGAGCCCTCGCGCATGGCGTGAGGGCCCTTGAAGACGGGATAAA 



FIGURE 6G

135 HKIO 42B CCCTTAGGAGGCGTCGCAAGAGCCCTCGCGCATGGCGTGAGGGCCCE^^ 

136 S52 428 CCGTAGGAGGCGTCGCAAGAGCCCTCGCGCATGGCGTGAGGGCCCTTGAAGACGGGATAAA

137 S2 428 CCGTAGGiU^GCGTCGCAMSAGCCCTCGCGCATGGCGTtU^GGGCCCTT^ 

1 3 S - 1 3 8 consensus CcGTAGGaGGCGTCGCAAGAGCCCrOSCGCATGGCGTGJU^GGCCCTXTU^^ 

SEQ ID NO: ISOLATE

138 DK12 4 89 TTTCGCAACAGGGAA CnX jCCCG GriXi CTC LnVin' CT ATCTT C CTTL'i'lt iCl X^^^ 

135 HKIO 4 89 TTTCGCJU^CAGGGAA CriXi CCCG UlHXj CTC LVl'lTLUAlVXn' C CTi'C^ 

136 SS2 489 Vl\'l\aCAJiCIiGGGJiJ^i'lXiCCCGUVl\MJ^VCCVi'X'^^ 

137 S2 489 I ' rrilj CAACJUXXWlCTrGCCCGGTTGCTCt rrAUVl ' A T LIUL^ ^ ^ 

135*138 consensus TTT - GCAACAGGGAACriXiCCCUGTIX^C IC C'l'i'i'i'Ci'A'l'C'l'l'CLTl'Ci'lXiC t Clti'i'lVi'C t 



SEQ ID NO: ISOLATE

138 DK12 550 TGCCTAATTCATCCAGCAGCTAGT 

135 HK10 550 TGCTTAATTCATCCAGCAGCTAGT

136 S52 550 TGCTTAgTTCATCCtGCAGCTAGT

137 S2 550 TGCTTAaTTGATCCaGCAGCTAGT 

135-138 consensus TGCtTAaTTCATCCaGCAGCTAGT 



68/89
FIGURE 6H



SEQ ID NO:

145 

143 

144 

140 

139 

142 

141 



DK13 
Z6 
Z7 
Z8 
Z4 
Z5 
Z1



139-145 consensus 



1 ATGAOCACGAATCCTJUU^CTCAAMAAAAACOUJ^^^ 

1 ATGAGCJlOGAATCCTAAACCTCAAAGAAAAACCAAAan'AACACC^ 

1 ATGAGCACGAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCARCCGCCGCCC^ 

1 ATGAGOUI^AATCCTAAACCTCAAAGAAAAACCAAACCnAACACCAACCGC 

I ATGAGCACGAATCCTAAACCrCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCCATGG 

1 ATGAGCACGAATCCTAAACCTCAAAGAAAAftCCAAACGTAACACCAACCGCCGCCCCATGG 

1 ATtaWSCACaAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGtCGCCCCATGG 

ATGAGCACgAATCCTAAACCTCAAAIUAftAACCAAACGTAACACCAACCGcCGCCCcATTO 



SEQ ID NO:
145 
143 
144 
140 
139 
142 
141 

139-145 



62 ACGTTAAGTTCCCX3GGlXX3cGGCCAGATCCrrrGGCGGAGm 

Z6 62 AanTAAGTTCCCGGGTGGTGGCCAGATCGTTGGCGGAGTO 

Z7 62 ACGTTAAGTTCCCGGGCGQTGGCCAGATCGTTGGCGGA GTTTACTTGTTG CCGCGC^^ 

Z8 62 AtGTAAAaTTCCCaGaCOGcGGCOUSATCGTTGGCGGAOmAOT 

Z4 62 AcGTAAAgTTCCCGGGTGGTCGCCAGATCGTTGGCGGAGTm 

Z5 62 ATGTAAAATTCCCGGGTGGTGGtCAGArCC?ITGGCX3GAGTTC 

Zl 62 ATGTgAAATTCCCGGGcGGcGGcCAGATCOTTGGaXSAGTr^^ 

consensus AcGT-AAgTTCCCgGOtGGtGGcCAGATCOTroGCGGAGTrTACTO 



SEQ ID NO:
145 
143 
144 
140 
139 
142 
141 

139-145 



DK13 
Z6 
Z7 
Z8 
Z4 
Z5 
Z1

consensus 



123 CCCtAGaTTGGGTGTGCGCGCOACTAGGAAGACTTCGGAGCXSGTTO 

123 cCCCAGgTIXXXntyrCCGCGCGACTAGGAAGACTTCGGAGCGGTCGCAACC^^ 

123 CCCCAGaTTGGGTGTGCGCaCaACTAGGAAGACTTCGGAGCGGTCGCAACC^^ 

123 CCCCAGGTTtKXrrGTGCGCQCGACTCGGAASACmXXSAGC^ 

123 CCCCAGGrTGGGTGTGCGCGCGACTCGaAAGACTTCOGAGCGGTCGCAACCTCCT 

123 CCCCAG U ' l ' mUG IG T G C G CGCOACTCGGAAGACTTCGGAGCGGTCGCAACCTCGcGGCASG 

123 CCCCcGGTTGGGTGTGCGCGCagCTCGGAAGACTTCGGAGCGGTCaCAACCTCG 

CCCcaGgrnXXmrroCGCgCgaCTcGgAAGACTTCGGAGCGGTCgCAACC^ 



SEQ ID NO:
145 
143 
144 
140 
139 
142 
141 

139-145 



DK13 
Z6 
Z7 
Z8 
Z4 
Z5 
Z1

consensus 



184 CGCCAGCCTATCCCaUtfXK:ga3cCaActcGAGGGtAGGTCCT^^ 
184 CGCCAGCCTATCCCCAAGGOkCGTCGATCrcAGGGAAGGTCC^^ 
184 CGTCAGCCTATCCCaUUXra^CGTC» 

III CGTCASc™TCCCCAACOCgCGcCaGcCaGAGGGCAGa^ 
184 CXntMCCTATCCCCcAGGCaCGtCGGTC 

CGtCAgCCrATCCCCaAGGCaCOtCggcccGAGGGcAGgTCCTGGGCtCAgCCcGGGTAc^ 



145 
143 
144 
140 
139 
142 
141 



DK13
Z6
Z7
Z8
Z4
Z5
Z1



139-145 consensus 



245 CtTGGCCcCTTIACGGcAATGAGGGCTGCGGOTGGGCGGGATGGCTtrCTG^^ 
245 CATGGCCTCTITACGGTAATaAOOGTTGCGGGTGGGaSGGAT^^ 
245 »TGGCCTCTTTACGGTAAcGAGGGTITCGG^^ 

245 CTOGCCCCT^TATGGC» 

245 CTTGGCCtCirrATGGCAATGAGGGCTGTGGGTGGGCAGGGTGGCTCC^^ 
245 CTTGGCCcCTTTAcGGCAATGAGGGCrcPrGG^^ 

C tTtSGCCtCTtTAcGGcAAtGAgGGcTGcGGGTGGGCaGG - TGGCTCcTGTC - CCcCGcGG 



SEQ ID NO:

145 
143 
144 

140 



DK13
Z6
Z7
Z8



306 CTCrCGgCCGTCTTOGGGcCCgAATGATCCraXK^gA^^ 
306 CTCTGGACCGTCTTGGGGtCaUATGATCCCCGGCGAAGGTCCCGCA^ 
306 crnriCGACCGTCTTGGGGCCCAAATGATCCXCGGCaAAGGTCCCXSCAAOT 
306 CTCTCGACCGTCTTtWGGCCCAAATGATCCCCGGCGGAGGTCGCGCAAm 



FIGURE 6H



139 
142 
141 



Z4 

Z5 
Z1



139-145 consensus 



306 CTCTCGGC(»TCTTGGGGCCCJU^TGATCCCCGGCGGAGaTCGCMCAATC^^ 

306 aTCTCXWCaiTCTTGGGGCCaAWlTCaiTCCCCGGCGTi^Kn'CCC^ 

306 tTCcaGGCCgTCTTGGGGCCccWlTGATCCCCGGCGTAGGTCCCGtAATCTCX^ 

cTCt cGgCCgTCTTGGGGcCcaAATGATCCCCGGCCgAGgTCcCQcAAt tTGGGTAAgCTC 



SEQ ID NO: ISOLATE

145 ^DK13 367 ATCGATACcCTAACTrGCGGcTrCGCCGAcCTCATCXKaiTACATC 

143 26 367 ATCGATACtCTAACTTGCGQcTTCGCCGAtCTCATGGGATACATCCOT 

144 27 367 ATCGATACCCTAACcTOCTCCTTtGCCGACCrCATGGGATACATCCCGCITGTAGGC^ 

140 Z8 367 ATCOATACCCTcACGTGCCKSCTTCGCaaACCTCATCWGATAC^^ 

139 24 367 ATCGATACCCTGACGTGCGGCTrCGCCGACCTCATGGGATAmTCCCGaTCGT^^ 

142 25 367 ATCGATACCCTGACGTGTGGCTTCGCCGACCTCATGGGATACATrcCGCT^ 

141 21 367 ATCGATACCCTGACGTQTGGCTKXSCCGACCTCATGGGATAC^^ 

139-145 consensus ATCGATACcCT- ACgTGcGGcOTcGCCGAcCTCATGGGATACATcCCGcOTGTaGGCOCCC



SEQ ID NO:

145 
143 
144 
140 
139 
142 
141 

139-145 



ISOLATE
DK13
Z6
Z7
Z8
Z4
Z5
Z1

consensus 



4 28 CCGTGGGtGGCGTCGCCAGaGCCCTGGCgCATGGc GTc AGGctTcTGGAGGACGGGgTCAA 

428 CCGTGGGCGGCGTCGCCAGGGCCCTGGCaCATGGt GTrA GGGCTgTGGAGGACGGGATCAA 

428 CCGTGGGCGGCGTCGCCAGGGCCCTaGCGCATGGCGTTAGGGCT« 

428 CaGTaGGaGGCGTCGCCAGaGCCCTGGCGCATGGCGTCAGGGCTCTG^ 

428 CcGTgGGgGGCGTCGCaUX^GCtCTGGCGCATGGCGTCAGGGCTGTGG^^ 

428 CaGTaGGTGGCGTCGCaWGGGCCtTGGCGCATGGCGTCAGGGCCcTGGAGGACGGAATcJ^ 

428 CtGTgGGTGGCGTCGCCAGGGCCcTGGCGCATGGCGTCAGGGCCgTGGAGGACGGAATt^ 

CcGTgGGtGGCGTCGCCAGgGCccTgGCgCATGGcGTcAGGgCtgTGGAGGACGGgaTcAA 



SEQ ID NO:

145 
143 
144 
140 
139 
142 
141 

139-145 



DK13
Z6
Z7
Z8
Z4
Z5
Z1

consensus 



489 TTATGCtJkCJJSGGJArCTTCCCa G ^ 

489 TTATGCAACAGGGAATCTTCCCGG'riliC'i'C'n'i'L'l CIATCITCCT CTTG GCA CTTCTITC G 

489 TTATGCAACAGGGAACCTTCCCGGri^CiVl'l'rtTCTAT CTTC CT CTTG GCA C^^ 

489 CTATGaU^CAGGGAACCTrCCtG<JriXjLU' C'ri'f CrrCTATCTrCC^^ 

4 89 CTATGCAACAGGGAATCITCCcGGTTGCTL'rrrCTCTAT CTTC 

4 89 CTATGCAACAGGGAATCTTCL"IX;uriX3CTt:cTTtTCTAT C^ 

489 CTAcGCAACAGGGAAcCrrCCTGGTTGCTCtTreTCTATCTTtCrtCTTGCACT^ 

CTAtGCAACJtfXXSAAt CTTCCcGGTTGCrCtTTcTCTATCTTcCTC tT 



SEQ ID NO: ISOLATE

145 DK13 550 TGCCTgACTGTTCCCgCtTCGGCC

143 Z6 550 TGCCTaACTGTTCCCaCCTCGGCC

144 Z7 550 TGCCTgACTGTTCCCGCCTCGGCC

140 Z8 550 TGCCTaACcGTcCCAGCGTCtGCT

139 Z4 550 TGCCTcACtGTtCCAGCGTCgGCT

142 Z5 550 TGCtTGACAACACCgGCATCcGCT

141 Z1 550 TGCcTGACAACACCaGCATCtGCc

139-145 consensus TGCcTgACtgttCC-gC-TCgGCc



FIGURE 6I



SEQ ID NO:
153 
152 
146 
147 
148 
149 
150 
151 

146-153 



SA11
SA6
SA4
SA5
SA7
SA1
SA3
SA13

consensus 



1 ATGAGCACCSAATCCTAAACCTOUU^QAAAAACCaAAAGAAACACCAACCGCra 

1 ATGJUSCACGAATCCTAAACCTCAAAGAAAAACCcAAAGJ^AACACCAACCGCCGCCCAC^ 

1 ATGAGOUrGAATCCTAAACCnXZAAAQAAAAACCAAAAGAAACACCAArCGCCGCCCAC^ 

1 ATCRGa^:GJJ^TCCTAAACCTCAAAGAAAAACCAJUUWaiU^RCA C CAACCGCro 

1 ATGAGCACGAATCCTAiVACCTaVAAaAAAAACCAAAAGARACACCAACCGCCGCCC^^ 

1 ATGJ^GCACGAATCCTJJtf^CTCAAJtfaARAAACaUJWSAAACA C C^ 

1 ATGAGCACGJU^TCCTAAACCTCAAAGAAAAACCAAAAGAAACACO^CGCCGCCCAC^ 
ATGJWGCACGAATCCTAAACCTCAAAGAAAAACCaAAAGAAACA C CAACCgCCGCCCACa^ 



153 
152 
146 
147 
148 
149 
150 
151 

146-153 



SA11
SA6
SA4
SA5
SA7
SA1
SA3
SA13

consensus 



62 ACGTCAAGTTCCCGGGCGGTGGTCMATC^rl^3GTGGA GTTTAL^^^^ 

62 ACGTCAAGTTCCCGGGCGGTGGTCitf«ATaj"i'iTCTGGitfrrrT ^^ 

62 ACGTtJVAGTTCCCGGGCGGTGGTOUSAlVGnUt^TGC^ ^ 

62 ACGTCAAGTTCCCGGGCGGTGGTaUSATCGTTGGTGOAG™ 

62 ACgrCAAGTTCCCGGGCGGTGCTCAGATCCrr i X 3G T G GA GTTrA^^ 

62 ACGTCAAfirrcCCGGGCGGTQGTCJVGATCGCTGGTGGAG^ 

62 ACGTCAAGTTCCCGGGCGCrrGGTCJ^GAUX^tSTTGGTGGA G^ 

62 ACXrrCAAGTTCCCGGGCGGTCGTCMavrCGTTGGTGGAGT^^ 

ACGTcAACrrTCCCGGGCGGTGGTCAGATLXiritXSTCGA£rrtT3^^ 



SEQ ID NO:
153 SA11 123
152 SA6 123
146 SA4 123
147 SA5 123
148 SA7 123
149 SA1 123
150 SA3 123
151 SA13 123
146-153 consensus




SEQ ID NO: ISOLATE
153 SA11 184
152 SA6 184
146 SA4 184
147 SA5 184
148 SA7 184
149 SA1 184
150 SA3 184
151 SA13 184
146-153 consensus




SEQ ID NO: ISOLATE
153 SA11 245
152 SA6 245
146 SA4 245
147 SA5 245
148 SA7 245
149 SA1 245
150 SA3 245
151 SA13 245
146-153 consensus





CCCTAGarrCXXrrOTGCOCGCGACTCGGAAGACTTCAGAACGGTCG^ 

CCCTAGGTTOGGTGTGCGCGCGACTOGGAAGACTIWGAACGGTCGCAACCCCGTtS^^ 

f'nrri».ryyiTYvraTypnr«asCGACTCGGAAGACT^ 



CCCtaGgtTGGCmrroCGCGCgACTCGGAAGACTTCAGAACGGTCGC^ 



CCCGGGTACC 
CCCGGGTACC 



CGcCAGCCTATtCCCAAGGCgCGCXAacCCaCGGGcCGGTCCTGGGGTCAACCCGGGTACC 




CTTGGCCCCTrrACGCCAATGAGGGO 



rCCGAGG 



CCCCGAGG 



CTTGGCCCcTTTAcGCCAATGAGGGCCTCGgGTGGGCAGGGTGOtTGCTCTCCCCc^
FIGURE 6I



153 
152 
146 
147 
148 
149 
150 
151 



SA11
SA6
SA4
SA5
SA7
SA1
SA3
SA13

146-153 consensus


SEQ ID NO: ISOLATE
153 SA11
152 SA6
146 SA4
147 SA5
148 SA7
149 SA1
150 SA3
151 SA13
146-153 consensus


SEQ ID NO: ISOLATE
153 SA11
152 SA6
146 SA4
147 SA5
148 SA7
149 SA1
150 SA3
151 SA13
146-153 consensus




SEQ ID NO: ISOLATE
153 SA11
152 SA6
146 SA4
147 SA5
148 SA7
149 SA1
150 SA3
151 SA13



306 CICTCGGCCTAACTGGGGCCCaUTC^tfrCCCCGGCGAAgATCGCGCA Ar^ 

306 CrcrCGGCCTAATTGGGGCCCaUlTGACCCCCGGCGAAAATCGCGCA ATI^^ 

306 CTCTO^GCCTAATIXX^GGCCCCAATGACCCCaXSCGAAAgTCGCGCA AT^ 

306 CTCTCGGCCTAATIGGGGCCCCaUlTGACCCCCGGCGAAAaTCGCGCAATTTC 

306 CTCnrGGCCTAATTGGGGCCCCAATGACCCCCGGCGAAAGTCGCGaATTrtX;GTA 

306 CTCTCGGCCTAATTGGGGCCCCAATGACCCCCGGCGC^AGTCGCGCAATTTGGGTAAGGTC 

306 CTCTCGGCCTAgTTCXKXICCCCAAcGACCCCCGGCGGAAATCQCGCAATTTGGGTAAG^ 

306 CTCTCGGCCTA&TIGGGGCCCCAAtGACCCCCGGCGGAAATCGCGCAAcTTGGGTAAGGT^ 

CTCTCGGCCTAatTGGGGCCCCAAtGACCCCCGGCGaAaaTCGCGCAAtTTGGGtAAGGTC 



367 ATCGATACCCTAACGTGCGGATTCGCCGACCTCATGGGGTACATCCCGCTCGTAGGCGG^ 

367 ATCGATACCCTAACGTGOX^TTCGCCGACCTCATGGGGTACATCCra 

367 ATCGATACCCTAACGTGCGGATTCGCCGACCTOlTGGGaTACATCCCGCTCGTAGG^ 

367 ATCGATACCCTAACGTGCGGATTCGCCGACCTCATGGGGTACATCCCGCTCOTA^ 

367 ATCGAcACCCTAACaTGCGGATTCGCCGACCrCATGGGGTACATCCCGCTCGTAGGCGGCC 

367 ATCGATACCCTAACGTGCGGATXCGCCGACCTCATGGGGTACATCCCGCTCCT 

367 ATCGATACCCTAACGTGCGGATTCGCCXaitCTCATGGGGTACATCCCGCTCGTAGGCGG^ 

367 ATCGATACCCTgACGTGOX^TrCGCCGAcCrCATGGGGTACATCCCGCTCGTAG 

ATCGAtACCCTaACgTGCGGATTCGCCGAcCTCATGGGGTACATCCCGCTCGTAGGCGGCC 



428 CCGTTGGGGGCGTCGCAAGGGCcCTCGCACACGGTGTGAGaGcT CTrG AGGACGGGGTAAA 
428 CCGTTGGGGGCGTCGCAAGGGCtCTCGCACACGGTGTGAGGGTICTTC^ 
428 CCGTTCGGGGCGTCGCAAGGGCCCTtGCACATGGTOTGAGGGCT 
428 CCGTTGGGGGCGTCGCAAGGGCCCTCGCACATGGTGTOW^^ 

428 CCGTTGGGGGCGTCGCAAGGGCTCTCGCACACGGTGTGAGQGTT CTTG AGGACGGGGTAAA 

428 CCCnTGGGGGCGTCGCAAGGGCrCTCGCACACOGTCT 

428 CCGlTGGGGGCGTCGaUUCXKKrrCTCGCACAtGGTGTGAGGGTTC^^ 

428 CCGTTGGGGGCGTCGauUXWCTCTCGCACAcGGTGTGAGGGTcCTI^ 

CCGTTGGGGGCGTCGCAAGGGCtCTcGCACAcGGTGTGAGgQttCTTGAGGACGGGGT^ 



146-153 consensus



489 CTATGCAACAGGGAATcTtCCCGU'ntiLnVrri'CU'CcAT CTTr aTCCTTGC^^ 

489 CTATGCAACAGGGAATITGCCCG U 'n' U C T C' n 'i LlC rAT CTTT gTC CTTG CACTrCTCTCG 

4 89 CTATGCAACgGGGAATITGCCCGGlUUUU'LiUlCiCTAT CTTTATCCriG CA ^ 

489 CTATGCAAC3«XKaAATTIGCCCGGrAXiC11JriTCrCTAT CTrrA '^^ ^ 

489 tTACGCAACAGGGAATgTGCCCG U ' ri ' UCILl ' X IC r C T A T CnTA TC CTTG CA CrrCTCTC G 

489 CTACGCaUUZAGGGAATTTGCCCG UTlUCrLTn CX CTAT CTITA TC CTIG CA CrrCnTC c 

489 CTACGCAACAGGQAATTTACCCGGrilrLaLi 1 ICTCrAT CTTTA TC CTTG CA LTl'LVn'C A 

489 CTAtGCAAdiGGGAATrTACCCGGTTGCTLUTl'LTCTA 

cTAt GCAACaGGGAATtTgCCCGGrilaLl L i i 1 L 1 CtATCTTTaTCCTTGCAwii u i CTCg 



153 SA11

152 SA6

146 SA4

147 SA5

148 SA7

149 SA1

150 SA3 

151 SA13 

146-153 consensus 



550 TGCtTgACCGTCCCgOCCaCTGCA 
550 TGCCTaACCGTCCCtGCCTCTGCA 
550 TGCCTGACCGTCCCgGCCTCTGCA 
550 TGCtTGACCGTCCCACCCTCTGCA 
550 TGCCTGACCGTCCCAGCCTCcGCA 
550 TGtCTGAtCaTCCCGGCCrCTGCA 
550 TGCCTGACCCTCCCGGCCTCTGCA 
550 TGCCTGACtGTCCCGaCCTCTGCc 

TGccTgAccgTCCCggCCtCtGCa

SEQ ID NO:






103-154 cons. 1
103-124 1 1
125-134 2 1
135-138 3 1
139-145 4 1
146-153 5 1
154 6 1








103-154 cons. 62
103-124 1 62
125-134 2 62
135-138 3 62
139-145 4 62
146-153 5 62
154 6 62



ATGAGCACgaaTCCtAAftCCtCAAAGAaAaACC&aAcGtAAcACcAaCCgcCGCCC&cagO 

ATGAGCACgAaTCCTAAACCTCAAAGAaAaACCAAi^GTAACACCAaCCQcCGCCC&C^ 

ATGJUSCAOUlaTCCtAAACCtCiVAAGAAAAACCAaAAGAAAcACaAACCQcCGCC 

ATGAGCACACTTCCTiUUlCCrCAAAGAAAAACaUJUUSJUU^ 

ATGAGCACgAATCCTAAACCTCAAAGAAAAA C CAAACGTAACACCAACCGcCXSCCCcATG^ 
ATGAGCACGAATCCTAAACCTCAAAGAAAAACCaAAAGAAACACCAACCgCCGCCCACAGG 
ATGAGOlCACrrTCCAAAACCCCAAAGAAAAACCAAAAGAAACACaUUrCGTCGCCCAA^ 



62 AcgTcAAgTTcCCgGGcGGtGGtCACATCGTtGGtCKlAGTtTActTGtTGCCGCGCAGGGG 



SEQ ID NO: Genotype
103-154 cons. 123
103-124 1 123
125-134 2 123
135-138 3 123
139-145 4 123
146-153 5 123
154 6 123


SEQ ID NO: Genotype
103-154 cons. 184



123 CCCcaGgtTGGGTGTGCGCgCgaCtaOgAAgaCTTCcGAgCGgTCgCAaCCtcGtGGaaGg 



103-124 
125-134 
135-138 
139-145 
146-153 
154 



184 CGaCAgCCtATcCCcaAgGctCGcCggcccgagGGcaggtcCTGGQctcagCCcGQgtAcC 

\ 184 CGaCAaCCTATCCCCAAGGC tCGcCggCCOSAGGGcAGGgCCTtXXSCtCAGCCcGGGtA^ 

184 CGCCAGCCCATCCCgAAAaATCGGCGCtCCACtGGauUStCCTGGGGAAaaCCaGGATAtC 

1 84 CGACAGCCTATCCCCAAGGCGCGTCGGAGCGAAGGCCGgTCCTGGGCTCAGCCcGGG^ 

184 CGtCAgCCTATCCCCaAGGCaCGtCggCCCGitfXKScAGgTCCTGGGCtCAgCCcGGGTAcC 

184 CGcOUSCCTATtCCCAAGGCgCCCCAacCCaCGGGcCGGTCCTGGGGTCAACCC^^ 

184 CGCCAACCTATACCAAftGGCGCGCCMCCCCJVGGGCAGGCACT^ 



SEQ ID NO: Genotype

103-154 cons. 245 CtTGOCCccTCTAtGgcaAtGAgGGcttcGggTGGGCaGGaTGGcTccTgTCcCCcCgcGG 

103-124 1 245 CtTGGCCCCTCTAtGgCaAtGAGGGCt tgGGgTGGGCaGGATGGCI'CCTJTCaCGCCgt GG 

135-138 3 245 CTTGGCCCCTCTATGGTAAcGAGGGCTGCGGGTGGGCAGGgTGGCirCTCT 

139-145 4 245 CtTGGCCtCTtTAcGQcAAtaAgGGcTGcGGGTGGGCaGGgTGGCTCcTGTCcCCcCGcGG 

146-153 5 245 CirGGCCCcTTTtoGCCAOTGAGGG^^^ 

SEQ ID NO: Genotype

103-154 cons. 306 cTCtcggCCtagtTGGGGcCccActGAcCCCCGGCgtaggTCgCGcAAttTGGGtAagGTC

103-124 1 306 cTCtCGGCCTAgtTGGGGCCCcAcaGACCCCCGGCGtJUXrrCGCGtAATtTGGGt^^ 

125-134 2 306 tTCtCgcCCttctTGGGGCCCCActGAcCCCCGGCAtAgaTCgOGcA ActTG GGtAagGTC 

135-138 3 306 CTCCCGTCCATCTTGGGGCCCAAAcGACCCCCGGCGgaGGTCCCGCRATTTGGGTAAaC^ 

139-145 4 306 cTCtcGgCCgTCTTGGGGcCcaAATGATCCCCGGCGgAGgTCcCGcAAttTGGGTAAgGTC 

14 6-153 5 306 CTCTCGGCCTAatTGGGGCCCCAAtGACCCCCGGCGaAaaTCGCGCAAtTTGGGtJU^ 

154 6 306 CTCCCGGCCACATTGGGGCCCCAATGACCCCCGGCGTCGATCCCGGAATTrGGGTAAGCTC

SEQ ID NO:
103-154 

103-124 
125-134 
135-138 
139-145 
146-153 
154 



Genotype 
cons.

1 
2 
3 
4 
5 
6 



367 ATCGAtACccTcACgTGcgGctTcGCCGAcCTCATGGGgTACATcCCgc^ 

367 ATCGAtACCCTcJ^aTGCGGCTTcGCCGACCIOVTGGGGTACATt CCGCTCG^^ 

367 ATCGAtACCCTaACgTGcgGt tTTGCCGACCItJlTGGGgTACaTcCCcaTOT 

367 ATCGATACCCTtACGTGCGGaTTCGCCGACCTCATOGGGTACATCCCGCTCGTC 

367 ATCGATACcCTgACgTGcGGcTTcGCCOAcCTCATGGGATACATcCCGcTCGTaGGCGCCC 

367 ATCGAtACCCTaACgTGOWATrcGCCGAcCTCATGGGGTACATCCCGCTC 

367 ATCGATACCCTAACGTGTGGGTTCGCCGATCTCATGGGGTJ^ 



SEQ ID NO:
103-154 



103-124 
125-134 
135-138 
139-145 
146-153 
154 



SEQ ID NO: Genotype

103-154 cons. 



103-124
125-134 
135-138 
139-145 
146-153 
154 



Genotype 

cons. 428 CcgTaGGgGGcGtcGCcaggGCccTgGCgCAtGGcGTcaGggttcTgGAgGACGGggTgAA

1 428 CcCTaGGgGGcGCTGCCACHrGCccTGGCgCAtGGcGTCCGgGTtCTGGAgG^ 

2 428 CaQTtGGaGGcGTcGCCAGJ^SCtCrgGCaOltGGtCrrgAGgGTc CTGG AgGACGGgaTaAA 

3 428 Cc<yrAGGaGGCGTCGCAAGAGCCCrrCaCGCATGGCGTaAGGGCCC^^ 

4 428 CcGTgGGtGGCGTCGCC»3gOCccTgGCgCATtK3cGTcAGGgctgTO^ 

5 428 CCGTTGGGGGCGTCGOUWKKSCtCTcGCACAcGGTGTGAGgGttCT^^ 

6 428 CITTCGGCGGCGTCGCGGCTGCGCTCGCACATGGC^^ 

489 cTatGCAACaGGgAAttTgCCcGGlTGCtCtTrcrrCtATcOTCCrrcCTgGCtCTgCT 

1 489 cTAtGCAACAGGGAAtcTgCCcGGTTGCtCtTTCTC»TCTTCCTCtTgGC^ 

2 489 tTAtGCAACaGGgAAttTgCCtGGlTGCTCtriX'XCTATcTTCtTgtfltOCcCTt^GTC 

3 489 TrrcOCAACAGGGAACTroCCCGGrrGCTCcTTrrC^ 

4 489 cTAtGCAACAGGGAAtCTTCCcKX S rr G CT CtTT cffCTAT CITc CT^ 

5 489 gTAtGCAACaGGGAATtTgCCCGUrmCTt i'l'i U'l C tATCTCT aTCCTT GCACTTCT eTCg 

6 489 TTATGCAACAGGGAATCTCCCCG GnnT;Cl V lVl 'CTCT ATLTIt.L ' m 



SEQ ID NO: Genotype
103-154 cons. 



103-124 
125-134 
135-138 
139-145 
146-153 
154 



550 TGcctgaccgtcCCagcttCtgct 

1 550 TGttTgACcatcCCaGctTCcGCt 

2 550 TGCatCaCagtgCCaGtgTCtGCt 

3 550 TGCtTAaTTCATCCaGCAGCTAGT 

4 550 TGCcTgACtgttCCagCgTCgGCc

5 550 TGccTgAccgTCCCggCCtCtGCa 

6 550 TGCCTCACAACCCCAGCTTCGGCT



76/89
FIGURE 7A



157 
158 
159 
160 
155 



S14 
SW1
S18 
DR4 
DK7 



155*160 consensus 



1 HSTOPKPQRJCTKIUmnUlPQXIVKFPGGGQZVGGVYIJ^ 
1 MSTRPKPQRKIiaumnUlPQDVKFPGGGQrvrGGVYIXPRRG^ 
1 KSTyPKPQR K T KRN TtnUtfQDVKPPGGGQIVGGVYIJiPIUtG 
1 MSTKPKPQIUCTKIUrnniRPODVKFPGGGQIVGGVYU^ 
1 KSTNPKPQRKTKIUrrNlUlPQDVKFPGGGQIVGGVYLLPIU^ 
1 HSTOTKPQRKTKRimnUlPQDVICPPGGGQIVGGVVLLP 

MSTOTKPQRKTKROTWUlPQOVKFPGGGQrWGGVYliPWWS 



SEQ ID NO: ISOLATE

157 
158 
159 
160 
155 



US11
S14
SW1
S18
DR4
DK7



62 RQPIPKAIUU>E6imiAQPGYPWPLYGNBGCGWAGWU^PRGSIU>S«GPTDPIUUtSIUn^ 
62 RQPIPKAMPEGRTKAQPGYPWPLYGNEGCGMAGKU^PRGSRPSHGPTDPRIUtSI^^ 
62 RQPIPKARRPSGRTKAQPGYPWPLYGNEGCGMAGHLLSPRGSRPSWGPTDPRRRSRHLGKV 
62 RQPIPKARRPBGRTKAQPGYPttPLYGIIEGCGWAGWZJiSPRGSRPSWGPTDPRRRSRm/^ 
62 RQPIPKARRPEGRTKAQPGYPWPLYGHEGCGHAGKLLSPRGSRPSWGPTDPRRRSRNL6KV 
62 RQPIPKARRPBGRTW^QPGYPWPLYGKEGCGWAGWIJiSPRGSRPSHGPTDPRRRSRNXiGKV 



155-160 



consensus 



RQPXPKARRPBGRTW1^3PGYPWPI.YGNEGCGWAGWIJ^PRG5RPSWGPTI>PRRRSRNLGKV 



SEQ ID NO:

156 

157 

158 

159 

160 

155 



US11
S14
SW1
S18
DR4
DK7



123 IDTLTCGFiU)UCYIPLVGAPU;GAARALAKGVRVI£DGVNY^ 

123 IDTLTCGFADUCYIPLVGI^XiGGAARAIiAHGVRVIjnxrniYATGRLPGCSFSIFX^^ 
123 IDTLTCGFJU)IiCYZPLVGAPI«GGAARAUU{GVRVLEIXrn!nrATGN^^ 
123 IX)TLTCGFiU)LlS3YZPLVQAPI/X;AARAIJWGVRVLEIX;V^ 
123 IjyrLTCGTJ^USOriin'VGMUSQlMJ^^ 

123 IDTLTCGFADIiaSYIPLVGAPUXSJUWlliAHGVRVLEDGVNYATGt^ 



155-160 



consensus 



IDTLTCGFADLHSYIPLVGAPLGGAARAIJUiGVRVI^GVtrYATGmiPGCSFSIF 



SEQ ID NO:
156 
157 
158 
159 
160 
155 

155-160 



US11
S14
SW1
S18
DR4
DK7

consensus 



184 CLTVPASA 

184 CLTVPASA 

184 CLTVPASA 

184 CLTVPASA 

184 CLTVPASA 

184 CLTVPASA 

CLTVPASA 



77/89
FIGURE 7B



170 
162 
171 
163 
165 
169 
164 
166 
167 
168 
161 
174 
172 
176 
173 



SEQ ID NO:



^^^1 1 MSTtPKPQRKITCIOTBRRPODVKFPGCraQIVGGVnXPW^ 

IIID8 1 MSTNPKPQRJCnaurnnmPODVKFPGGGQIVGGV^ 

S45 1 MSTNPKPQRtfTKRmWRRPQDVlCPPGGGQIVGGVniLPRWjPRI^^ 

S9 1 MSTin»KPQRraaaiTNRRPODVKPPGGQQIVGG\nfIiP!^ 

Dl 1 MSTNPKPQWCriaanWRRPQDVKFPGGGQIVGGVniPIWGPI^^ 

PIO 1 MSTOPKPQRKTKMmniRPQDVKFPGGGQIVGGVyiiPRRGPHLGn;^ 

IND3 1 MSTNPKPQRICrKRNTiniRPQDVKFPGGGQlVOavm*PRROPI^^ 

US6 1 MSTNPKPQWCTKRNraRWQDVKFPGGGOIV^^ 

DKl 1 MSrriPKPORlCnaumiRlUiQDVKFPGGGOIVt^^ 

TIO 1 MSTOTKPQWCnOOTNIUlPQDVKPPGGGQIVGGVyX^ 

SW2 1 MSraPKPQRJCrKRNTNWtfQDVKFPGGGQIVGGVYL^ 

SAIO 1 HSTIIPKPQRlCrXimimtPQOVKFPGQGQZVGGVmJR^ 

HK4 1 HSTNPKPQRXTKRinVRRPQDVKFPGGGQIVGGVyiXPX^ 

HK3 1 MSTOTKPQRlcrKRHTNIUlPODVKFPGGGOIVGGVmiPW^ 

T3 1 MSTNPKPQWCnawmnWPQtJVKFPGGGQIVGGVYUiPRIW 

HK5 1 MSTNPKPQRICnCRimniRPQDVKFPGGGQIVGGVYU-PRTO 



161-176 consenBUB 



Iff 

170 
162 
171 
163 
165 
169 
164 
166 
167 
166 
161 
174 
172 
176 
173 



IND8
S45
S9
D1
P10
IND3
US6
DK1
T10
SW2
SA10
HK4
HK3
T3
HK5



62 
62 
62 



62 RQPIPKARRPEGIUlW2^PGHPWPLYa£IS6IX;UAGWXJ^PSGSRPSWGPTO 
62 RQPIPKARRPEGMWAQPGHPWPI.YGE7BGU;KAG1fLLSPnGSI^STO 
62 ROPZPKAlUlPBGIUWAQPGHPVPLYGRCGLGWAaWLLSPM^ 
62 ROPZPKAimPEGIUlWAQPGyPVPLYQREGLGHIUmXSPRGSRPSirciPnDPR^ 
" R0PIPKAIUIPEGRAKAQP0YPVPLY6I7EGI/;KA6HXJ^PR6SRPS1^ 
ROPIPKARRPSGiaHAOPGYPWPLYGKSGLGWAGWIJ^PRaSRPSHG 
RSpiPlCAlUlPEGIUlUAQPGyPWPLYORBGXJGHAGWXiLSPROSRPSm 
62 R0PZP3CAIUVEGRAWAQPGYPWPLYGI1EGHGWAGNXJ^P]U3SRPSWGPTDPRI»SR^^ 
62 R0PIPKAIUlPE6IU!a0PGYPWPLyGNEaM0WAGNU«SPRGSRPSWGPnD 
62 R0PIPKARQPBGHAWA0PGYPWPLYGNEGMGWAGH1J.SPRGSRPSWGPTDPRIWSRNLGKV 
62 ROPIPKARQPEGRAWJ^PGYPWPLYGHBGMGWtfSWU^PRGSRPSWGPTDPRIUlSRMMI^ 
62 RQPIPlSRSpEGPTM3^PGYPWPLyGNSGlGMA6ia<I.SPRGSRPSWGPTD 
62 RQPIPKAROPSGRTHAQPGYPIiPLYQNBCasmAGiaiLSPRGSRPSWGPn 
62 R0PIPKAROPEGRTHAQPGYPWPI.YGHS(afi;KA6HXJ4SPRGSRPNWGPTDP 
62 RQPIPKARRPEGRAWlQPGYPWPLYGdSCaS^HAGWU^PRGSRPNWGPTDPRRRSRI^ 
62 RQPIPKARRPSGRtWlQPGYPWPLYGnEGtfi?iaiSfa.LSPhGSRPBWGPTDPRRRSRXII/3 



161-176 



n 

170 
162 
171 
163 
165 
169 
164 
166 
167 
168 
161 
174 
172 
176 
173 



consensus RQPiPiCARrPEGRaHAQPGyPWPLYgnSG-GWAGWLLSPrGSRPBHGPtiDPRRRSRNLGKV 

^^^^^ J23 limTCGFADLICTIPLVGgPIXKSvARAUWGVRWEDGVHYAT^ 

IND8 123 IDTLTCGFJtf>IlCTIPLVGAPLGGJUWaJ W gVRVLEDGVWYATGWLPGCSFS I FLLALLS 

S45 123 IDTXTtrGFADIlCTIPLVGAPUKaAARAIJtflGVRVLSDGVirf ATGNLPGCSFS i n 

S9 123 IDTLTCGFADIilGYIPLVGAPLGGAARAIJUiGVRVIJEDGVNYATG 

Dl 123 IDTLTCGFADIJCTIPLVGAPLGGAARAIiAHGVRVLSDGVOTATGt^ IFTJiATiTiS 

PIO 123 ZDTLTCGFADUXjYIPLVGAPLGGAARAXJUiGVSVLBDGVNYATGmiPGC^ 

IND3 123 ZDTLTCGFADIMH'ZPLVGAPLGaAARALAHGVRVLEDGVNYATGNLPGCSFSZFLX^^ 

US6 123 ZDTLTCGFA0L1CYZPLVQAPUK3AARAIAHGVRVLBDGVNYATGMLPGCSFSZFU 

DKl 123 ZDTLTCGFADLHSYZPLVGAPZ/K3AARAIJUI6VRVI£DGVNYATGm.P 

TIO 123 ZPTLTCGFADUKSYZ PLVGAPIXK3AARALAHGVRVLEDGVNYATGHLPGCSFS ZFTJ AT JiS 

SW2 123 IDTLTCGFADLMGYZPtVGAPIXXSAARALAHGVRVLSDGVIIYATGM^ 

SAIO 123 ZDTXTCGFADZiCjYZPLVGAPLGGAARAIAKGVRVLEDGVNYAT^I^ 

HK4 123 ZDTLTCGFADIiXnTPLVGAPIX;GVARAIJUIGVPVvEDGVinrATGNLPGC5^ 

HK3 123 ZDTLTCGFADIiCYZPLVGWLGGVARAIJtf{GVRVLEDGVWYATGmJGCSFSZFT.TAT.TJ> 

T3 123 ZDTLTCGFADI2!GYZPLVGAPIiGGVARAIiAH6VRVLEDGVNYATGNX«PGCS 

HK5 123 IDTLTCGFADIiCYZPLVGAPLGGVARAIJaiGVRVLSDGVNYATGNiPGCSFSZFIiL^^ 



161-176 consensus 



ZDTLTCGFADI«GYIPLVGaPLGGaARAIJ«GVRVlBDGVNYATG»lPGCsFSlFlXAia*^ 



SEQ ID NO:

175 
170 
162 



ISOLATE



IKD8 
S45 



184 CLTIPASA 
184 CLTvPASA 
184 CLTIPASA 



FIGURE 7B



171 S9 184 CLTIPASA 

163 01 184 CLTIPASA 

165 PIO 184 CLTIPASA 
169 im>3 184 CLTIPASA 

164 0S6 184 CLTIPASA 

166 DKl 184 CLTIPASA 

167 TIO 184 CLTIPASA 

168 SW2 184 CLTIPASA 
161 SAIO 184 CLTIPASA 
174 HK4 184 CLTIPASA 

172 HK3 184 CLTtPASA 
176 T3 184 CLTIPASA 

173 HKS 184 CLTtPvSA 

161-176 consensus CLTiPaSA



FIGURE 7C



m 

176 
172 
174 
161 
168 
167 
166 
164 
169 
165 
163 
156 
157 
158 
159 
160 
155 
170 
162 
171 
175 



T3
HK3
HK4
SA10
SW2
T10
DK1
US6
IND3
P10
D1
US11
S14
SW1
S18
DR4
DK7
IND8
S45
S9
P8



155-176 consensus 



SEQ ID NO:

173 

176 

172 

174 

161 

168 

167 

166 

164 

169 

165 

163 

156 

157 

158 

159 

160 

155 

170 

162 

171 

175 



HK5
T3
HK3
HK4
SA10
SW2
T10
DK1
US6
IND3
P10
D1
US11
S14
SW1
S18
DR4
DK7
IND8
S45
S9
P8



li?tPlSQRIOTSNSRSQ 

MSTnPKPQWcTKlOTnRRPQDVKFPGGGQIVGGVYIJ^RM 



62 
62 
62 
62 
62 
62 
62 
62 
63 
62 
62 
62 
62 
62 
62 
62 
62 
62 
62 
62 
62 
62 



mPIPKAR»BGRAm^ 

HnPIPKMSsGin^PGYPWPLYGREGCCnaGWlJ^ 



155-176 consensus



RnPlPKARhPEGRAWAOPC^r^^ 
RQPIPlStePEGSwAQPG^W 

RQPIPKJ«rPEGRaKAQPGyPWPLYgnEG-GWAaWU*SPrGSRPsWGPtDPRHRSiaM 



SEQ ID NO:

176 
172 
174 
161 
168 
167 
166 
164 
169 
165 
163 
156 
157 
158 



T3 
HK3 
HK4 
SAIO 
SH2 
TIO 
DKl 
US6 
IND3 
PIO 
Dl 
USll 
S14 
SWl 



123 IDTLTCGFADUffiYIPLVGAPLGGVARAIJUiGVRVLBDGVOT 

ill idSSjSS^iplvSpl^^ 

123 iDSTTCF^lJ«5YIPLVGAPLQGAARAIJUiGVRNrL^^ 

123 IdStcGFADLMCTIPLVG^ 

123 IdStcGFADiSctIPLVGAPI^^ 

^23 IdSwFMI^IPI^VGAPiSg^ 

123 IdStcGFMiSotIPLV^MGA^ 

123 iBStCTFMI^iSvGAPiSg^ 

^23 iBStotfadiSyiplvqapi^^ 
III idStcgfmiSbyiplvgapi^^ 
133 5SStcgfadi^iplvgSix^^ 

123 IdStCGFADI^IPLVGAP 



FIGURE 7C



159 
160
155 
170 
162 
171 
175 



S18 
DR4 
DK7 
IND8 
S45 
S9 
P8 



155-176 consensus 



123 TTrTLTCGFMIiiGYIPLVGAPliGGAAlUaJWGVRVLBIXnWYATG 
123 IDTLTCGFMUCYXPLVGAPIiGGAAIUIJWGVRV^^ 

123 iDTLTCGFADXMOYIPLVGAPLGGAAIUaiAHGVRVLBDCr/KyATGmiPGCSPSZFI^^ 
123 imTCQTJ^USGrin^JGl^lJSQ^ 

123 IDTLTCGFJU)U«5yiPLVGAPL0GAARAIJUlGVRVIXDGTOyA 
123 zm^rcSFI^USSYlVhVt^ 

123 IDTLTCGFMLMCTIPLVGgPLGGvWUUiAHGVRVVEDG^ 

IDTLTCGFjUJUSGYIPLVGaPLGGaASlWJUIGVRVlETCVNYATCT 



SEQ ID NO:

173 
176 
172 
174 
161 
168 
167 
166 
164 
169 
165 
163 
156 
157 
158 
159 
160 
155 
170 
162 
171 
175 

155-176 



ISOLATE

HK5
T3
HK3
HK4
SA10
SW2
T10
DK1
US6
IND3
P10
D1
US11
S14
SW1
S18
DR4
DK7
IND8
S45
S9
P8

consensus



184 CLTtPvSA
184 CLTIPASA
184 CLTtPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA
184 CLTVPASA
184 CLTVPASA
184 CLTVPASA
184 CLTVPASA
184 CLTVPASA
184 CLTVPASA
184 CLTVPASA
184 CLTIPASA
184 CLTIPASA
184 CLTIPASA

CLTiPaSA 



FIGURE 7D



m 

178
180 
177 



177-180 



T9 

nsio 

T4 

consensus



1 MSTNPKPORlCriRimnWPQDVKPPGGGQlVGGVYLIJ»lWGPiaiG^ 
1 MSTNPKPORKTiaamnUlPQDVKFPGGGQIVGGVyLl^RRGPRI/3^^ 

1 MSTnPKPQRlCriaOTinUlPQDVKPTGGGOIVGGV^^ 

MSTnPKPQWCTWOTTNRRPQDVKFPGGGQIVGGVYIJ-PRRGPIUiC^^ 



SEQ ID NO:

178 
180
177 

177-180 



T9 
OSIO 
T2 
T4 



consensus 



IP TO; 1 



178 
180 
177 

177-180 



USIO 
T2 
T4 



consensus 



SEQ ID NO:

179 
178 
180 
177 

177-180 



T9 
USIO 
T2 
T4 

consensus 



62 I^PlpmRRSTGKSWGKPGYPWPLYGNEGLGWWSWlXSPWSSRPSWG^ 

62 ^PIPIOJIWSTGKSWGKPGYPWPLYGNEGlJGWAGWIXSPWSSRPSWGPHDPWIRSRll^ 

RQP I PJaJRRsTGKSWQKPGYPWPLYGNEGLGWJWGWIiS PiaSSRPSWGPnDPRHRSR^^ 

123 IDTLTCOPADI14GyiPV\«SAPIiGGVARMJa«GVRVLBO^^ 
123 IDTLTCGFJU3I1CYIPVVGAPI/X;VARAIJ^GVRVLSIXjVN^^ 
123 iDTLTCGFADlMGYIPVVGAPLGGVAWUJUiaVRVliBIXOTiyA^ 
IDTLTCs Ua)I2fflyvPVVGgPLGGVARAIJ«GVRVLBIXjVHYATC 

IDTLTCgf ADLMGYi PVVGaPLGGVARAIJVHGTO\n-EDOVOTATGMIiPGCSFSIFLI^^ 



123 



184 CITtPaSA 

184 crrzpvsA 

184 CXTZFVSA 
184 CZTIPVSA 

CITiPvSA 



FIGURE 7E



m 

184 

181
182 
185 



m 

SW3 
T8 
USX 
0K8 



181-185 consensus



184 
181 
182 
185 



IP TO; 



.1 
SW3 
T8 

nsi 

DK8 




181-185 consensus 



MSTNPKPQRlCnCRimniRPQDVKFPGGGOIVGGVyiXP 



62 R0PIPia)WlSTOlCpWGKPGYPWPLYGIIEGCQWAGWU*SPI«5SHPNWG^ 

11 SoPIProRRSTOlSwGlSGYPWPLyGNBG 

62 ROPIpSRRSTCKSWGKPCmWLYGNEGC 

62 MPIProRMTCJSwaKPGYPWPLYGNBGCCWA^ 

62 RQPIPCTfiSsTClSMGKPGyPWPLYGREGCCJW^ 

RQPIPKDRRSTGKsWGKPGYPWPLYGNEGCGWAGWLLSPRGSrPtWG^ 



m 

184 
181 
182 
185 



SEQ ID NO: ISOLATE



.1 
SW3 
T8 

nsi 

DK8 



181-185 consensus 



123 IDTITCGFADLMGYIPVVGAPVGGVMUILAHGVRVI^^ 
123 IDTITCGF^iSyIPVVG^^ 

123 iDnT^FADI^IP^AraA^ 
123 idTITCQFMiSyIPW^ 

IiyriTCGFMI^tTYIPVVaAPVGGVWUaJUIGVRV^^ 



SEQ ID NO:

183 

184

181 

182 

185 



m 

SW3 
T8 
USl 
DK6 



181-185 consensus 



184 CCTVPVSA 
184 CPTVPVSA 
184 CPTVPVSA 
184 CaTVPVSA 
184 CcTVFVSA 

C-TVPVSA 



FIGURE 7F



ID NO: 



184 

181 
182 
185
186 
178 
180 
179 
177 



177-186 



SW3 
T8 
USl 
DK8 
S83 
USIO 
T2 
T9 
T4 

consensus



MSTNPKPQWCTKJOTNRRP^ 

MSTOTKPQWmOUmmRPODVKFPGGGQIVGGVYIiPRWJPIU^ 
MSTiPKP0RICrKMITOIUU>QDVKFPGGGQIVGGVyiJ*PRI«5PiaGV^ 
MSTNPKP5Ria'iRHTinUlPQDVKPPGGG0IVGG\nrLLPRRGPRIiG^ 
MSTKPKPQRKTJtWmJRRPQDVKFPGGGQIVGGVnJJ 

MSTnPKPOWCntWmnWPODVKFPGGGQIVGGV^^ 



SEQ ID NO:

183 
184 
181 
182 
185 
186 
178 
180 
179 
177 

177-186 



OKll 
SH3 
T8 

nsi 

DK& 

se3 
osio 

T2 
T9 
T4 

consensus 



62 ROPIPKDWlSTGKpWGKPGYPWPLyGNBGCGWMWLI^PRGSHPNWGP^ 

62 RQPIPKDRRSTQlSwGKPGyPWPLYGNBGCCWASWLLSPRGSHPlWGP^ 

62 ROPIProRRSTGKSWGKPGYPWPLYGIIEGCGWAGWLLSPWjSRPTWGPTDPWn^ 

62 ROPIPKDWlSTGKSWGKPGyPWPLYaNEGaSWAGWLLSPRGSRPTWGPTDPI^^ 

62 R0PIPia)RRSTGKSWGKPGYPWPLYGNEGCGW3«»WIiSPWJSRPTWGPTDPRHRSI^ 

62 ROPIPKDRRtTCKSWOrPGYPWPLYGRBGLGWAOWLI^PWSSRPSWGPTDPRHycSR 

62 ROPIPraRRpTGKStfGKPOYPWPLYGNEGIXnfAGWLI^PRGSRPSWGPTDPRHRSRKVGKV 

62 ROPIPrawSTGKSWGKPGYPWPLYQNBGMWAGinXSPRGSRPSWGPxiDPRHRSRI^ 

62 ROPIPia>RRSTGKSWGiaPayPWPLYGNBGlXjWAGWLI^PRaSRPSWGPBDPRHRSRinroiCV 

62 RQPIPKDRRSTGKSWQKPGYPWPLYGNBGUSWAGWLLSPRGSRPSWGPnDPRHRSRri^ 

RQPI PRDRRsTGKsWOtPGyPWPLYGNBO • GMAGWLLS PRGSr PsWGP tDPSWrSRHlGJcV 



!^ 

184 
181 
182 
185 
186 
178 
180 
179 
177 



177-186 



SW3 
T8 
USl 
DKS 
SB3 
USIO 
T2 
T9 
T4 



consensus 



SEQ ID NO:
183 
184 
181 
182 
185 
186 
178 
180 
179 
177 

177-186 



ISOLATE 
DKll 
SW3 
T8 
USl 
DK8 
SB3 
USIO 
T2 
T9 
T4 

consensus 



123 IDTITCGFJU5UffiYIPVVG&PVGGVARMiAHGVPVIJnX5INYATGm 

123 IDTITCGFADIifijYI PVVGAPVGGVaRALAHGVKVLBDGIWYATGNLPGCSFSIFT.TiATiTi^ 

123 IDTITCGFADLHDYIPV\raAPVGGVARAlJa!GVRVLEIXjINyATGOT*PG 

123 IDTITCGFADIi«;YIPVVGAPVGGVARAIJUiGVRVIJBDGINyAT6NlJGCSFSIFIJ^^ 

123 ZOTZTCGFADX^CYZPVVGAPVGGVARAIJUiG\niV^ 

123 IDTLTCGFADIlCTIPVVGAPVGGVARAIjaiGVRVIJTOINyATGl^^ 

123 IDTLTCGFADIlCTIPVVSAPUXJVARAIJ«GVltVLBIXr/^ 

123 IDTLTCGFADIACTIPVVGAPLGGVARAIJtflGVRVLSDGV^ 

123 IDTLTCGFADUlGYIPVVGAPIXXjVARAIJUKSVRVIifil^^ 

123 Xim«TC8lADU^vPVVGgPLGGVARAUUIGVRVLSDGVNYATGN^ 

Iirr - TCgCADl^YiPVTOaPvGGVARAlJUfGVRVLEDGiHYATGmJP 



184 CcTVPVSA 
184 CFTVPVSA 
184 CFTVPVSA 
184 CaTVPVSA 
184 CCTVPVSA
184 CISVPVSA 
184 CITIPVSA 
184 CITIPVSA 
184 CITtPaSA 
184 CITiPvSA 

CitvPvSA 
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187 
190 
188

187-190 



if 



HKIO 
DK12 
S52 

consensus 



1 MSTLPia>ORJCTKROTIHRPQDiKFPGGGQrVGGVYVLPRI^^ 
1 MSn!piSQIUCriaOTIWU»QDVKFPGGGQ 

1 MSTLPKPQR3ClTCRm'IRRPQDVKFPGGGQIVGGWn:iPIUlGPR^ 
MSTLPKPQRKTKRimRRPQDvKFPGGGQIVGGVyVLPraGPRL^^ 



187 
190 
188



IP yP: 



S2 
HKIO 
DK12 
SS2 



187-190 consensus 



62 ROPIPKARRSEGRSWADPQYPWPLYGNBGCGWAGWU*SPRGSRPSWGPNDPRWlSWn^ 
6 2 R0PZPKAIU^EGI^^PGYPWPLYGNEGCGWJ^WLX^PIU;SRPSWGPin3PU 
62 ROPIPKARRSEGRSTOQPGYPWPLYGNBGCGWAGWIJ^PRGSRPSWGPNDPRI^ 

62 rqpipkarSegrswaqpgypwplyqnbgcgwagwu^pto 

rqpipkarrsegrswaqpgypwplygnbgcgwagwu*sprgsrpswgpiroprwsrhu;kv 



gS9 IP TO, JSgLfiTS iDTLTCGPMMGYIPLVGAPVGGVWUaiAHGVRALBDGtMFAT^^ 

HKIO 123 ir>TLTCGFADI^IGYIPt.VGAPVGGVRRAIiAH0VHALEDGI»FAT6m*PGCSFSlF^^ 

DK12 123 IDTLTCGPADl^YIPLVGAPVGGVARAIJUlGVRiajnXSINPATGNl^ 

S52 123 IDt£tcGFADI^YIPLVGAPVGGVARAIJUIGVRA1^ 

consensus iDXLTCGFADIifiSYlPLVGAPVGGVARAtJUiGVRALEDGIKFATGR^ 



189 
187 
190 
188 

187-190 



SEQ ID NO:

187 
190 
188 

187-190 



HKIO 
DK12 
S52 

consensus 



184 CLIHPAAS
184 CLIHPAAS
184 CLIHPAAS
184 CLvHPAAS

CLIHPAAS 



FIGURE 7H



194 
193 
192 
195 
196 
191 
197 

191-197 



ISOLATE
Z5 
Zl 
Zd 
Z6 
Z7 
Z4 
DK13 

consensus



1 MSTOTKPORKTKRimnUlPMDVKFPGGGQIVGGVyiiPRRGPRI^^ 

1 MSTNPKPORKTiawnnUlPMDVKFPGGGQIVGGVyiJiPRRGPRI/^^ 

1 MSTNPKp5wcriCRWnnWPMDVKFPGGGQIVGGVYliPIU«^ 

1 MSTOPKPORKTKRNTNWlPMDVKFPGGGQlVOGVm-PWWjPI^ 

1 MSTKPKPQWCI^a«^^^^RPMDVKFPGGGQlVGG^nfLIJ»I«« 

1 MSTNPKPQWCnOlKTOIUlPimVKFPGGGQr^roGVyiJ-^ 

1 MSTNPKPQRiCrKWmiRRPMDVKFPGGGQIVGGVYIJLiPIWGPRIX^^ 

MSTNPKPQRirriaiimiRRPMDVKFPGGGOIVGG\nri^ 



SEQ ID NO:
194 
193 
192 
195 
196 
191 
197 

191-197 



Zl 
Z8 
Z6 
Z7 
Z4 
0K13 

consensus 



62 RQPIPqAIUlSEGRSmQPGYPWPLYGKBGCGWAGWIXSPRGSRPSWGqiroPRSUlSIUn^ 
62 ROPZPKAIUISEGIISHAQPGYPWPLYGKEGCGWAGVIXSPIIGSRPSWGPW 
62 TOPIPKARRSEGRSWAQPGYPWPLYGNEGCGWAGWLLSPRGSRPSWGPKDPIWISRI^ 
62 ROPXPKAMKSBGRSWAQPGYPWPLYGNEGCGWAGVUSPRGSRPSWGPNDPIUIRSRN^ 
62 ROPlPKARRSSGRSWAQPGYPWPLYGMEGCGKAGWLLSPmjSRPSWGPimPRraSRNIXj 
62 ROPIPKAROpEGRSWADPGYPWPLYGMBGCGWAGWLLSPRGSRPSHGPKDPIUUlSR^ 
62 RQPIPKARQlEGRSWAQPGYPWPLYGHEGCGWAGWZUPRGSRPSWGPNDPIUUlS 

RQPIPkARrsEGI^WJ^PGYPWPLYGIIEGCGWAGHLLSPRGSRPSWGpNDPRRRSI^ 



193 
192 
195 
196 
191 
197 



SEQ ID NO: ISOLATE



if 



Zl 
Z8 
Z6 
27 
Z4 
0K13 



191-197 consensus 



123 IIOTiTOSFADliCrYIPLVGAPVGGVWUaJaiGVWaETC 

123 iDTLTCGFADIJMSYIPLVGAPVGGVMAIjaiGVRAVSDGIinrATGm.^ 

123 IDTLTCGFADXlfi^YZPLVaAPVGGVAIUIJUIGVIUVBDGZinrAT^^ 

123 ZDTLTCGFWX2CTZPLVGAPVGGV3UUIJU1GV1UIVEXX3ZKYAT^^ 

123 lZ>TLTCGFADLMGYIPLVGAPVGOVWUaiAHGVIUaEDGIOT 

123 lim*TC0PADU1GYIPiVGAPVGGVWUIJUlGVRAvBD0INYATGm*PGC^ 

123 Iim*TCGFADUCTIPvVGAPVGGVAWaJ«GV1UlEDGvNYATGNLPGCSFS 

IDTLTCGFADI^YIPlVGAPVGGVAJlAIJWGVRavEDGiMYATGNLPGCSFSIFUJ^ 



SEQ ID NO:

194 

193 

192 

195 

196 

191 

197 



Zl 
Z8 
Z6 
Z7 
Z4 
DK13 



184 CLTTPASA 
184 CLTTPASA 
184 CLTVPASA 
184 CLTVPtSA 
184 CLTVPASA
184 CLTVPASA
184 CLTVPASA 



191-197 



consensus 



CLTvPaSA 
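
The consensus row above can be reproduced from the seven rows printed for position 184. The following is a minimal sketch only. It assumes a case convention that this section does not state and that is inferred from the rows shown: an upper-case letter where every isolate has the same residue, a lower-case letter for the most common residue where isolates differ, and a dash where no residue is in the majority. The function name `consensus` and the strict-majority threshold are hypothetical choices made for illustration, and the input rows are copied from the text above, so any letters altered in scanning carry through.

```python
from collections import Counter


def consensus(rows, gap="-"):
    """Build a consensus string from equal-length aligned rows.

    Assumed convention (not stated in the source): invariant columns are
    upper case, variable columns with a strict majority are lower case,
    and columns without a strict majority are shown as `gap`.
    """
    rows = [row.upper() for row in rows]  # ignore case in the input rows
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows must have the same length")

    result = []
    for column in zip(*rows):  # walk the alignment column by column
        residue, count = Counter(column).most_common(1)[0]
        if count == len(column):
            result.append(residue)          # identical in every isolate
        elif count > len(column) / 2:
            result.append(residue.lower())  # majority residue, variable site
        else:
            result.append(gap)              # no strict majority
    return "".join(result)


# The seven rows listed above for amino acid position 184 (SEQ ID NOs 191-197).
rows = [
    "CLTTPASA",
    "CLTTPASA",
    "CLTVPASA",
    "CLTVPTSA",
    "CLTVPASA",
    "CLTVPASA",
    "CLTVPASA",
]

print(consensus(rows))  # CLTvPaSA
```

Under these assumptions the output agrees with the consensus row printed above; a different threshold for lower case or for the dash would change the result at variable positions.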
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ID NO: ISSl 



m 

202
198 
199 
200 
203 
201 
204 



198-205 



m 

SA3 
SA4 
SA5 
SA7 
SA13 
SAl 
SA6 



consensus 



1 MSTNPKPQRICnaamiRRPQDVKFPGGGQIVGGVYLIJ^RroPRI^^ 
1 MSTNPKPQWCnOUITNRRPODVXFPGGGQIVGGVYlXPRWjPRLG^^ 
1 MSTNPKPQWCTTCRiriOTRPODVKFPGGGQIVGGVVlJJRRGPIUXr^ 
1 MSTOTKPQWCnCRNTNRRPQDVKPPGGGQIVGGVYlJ-PW^ 
1 MSTOTKPORICnaiirrNIWPQDVKFPGGGQrVGGVVlJjPIUUS 
1 MSTOPXPQR K T IUO mniRPQPVKFPGGGQIVGGVYIJiPRTOPRLGVRCT 
1 MSTin»KPORKTKRNTNlW»QDVKFPGGGOrVCX7(mXPRW5PRl^^ 
1 MSTOTKPQWO'qROTWrl^QDVKFPGGGQIVGGVyiXPRRGPRmGV^ 

MSTNPKPQWCTkWbrrarRPQDVKFPGGGQrVGGVYtt 



XP TO; 



m- 

202 
198 
199 
200 
203 
201 
204 



198-205



iSll 62 ROPIPKAROPTGRSWGQPCOTWPfYJOTGLgWaGWIJ*SPRGSRPnWGProPRRrSRlsn^ 

SA3 62 ROPIPKAROPTGRSWGQPGyPWPLYANEGLeWAGin-LSPRGSIU»sWGPNDPRIU^ 

SA4 62 RQPIPKARQPTCRS»roQPGyPWPLYJU«E0DCWW5WIJ*SPR^ 

SA5 62 RQPIPKARQPTGRSWGQPGYPWPLYANEGDGWW^WIXSPRGSRPR^ 

SA7 62 RQPIPKARQPTQRSTOQPGyPWPLYANBGIXSWWJWI^ 

SAia 62 ROPIPKARePTGRSWGQPGYPWPLYiOTGLGWAGWIJ^PRGSRPNWG^ 

SAl 62 ROPIPKARQPTGRSWGQPGYPWPLYJUIEGl-GWAGWlJ^PRGSRPtTOPNDPRRKSRKW 

SA6 62 RQPIPKARQsaGRSWGQPGYPWPLYi«IEGU5WAGWLI^PRGSRPNWGPNDPRRl« 

consensus RQPiPKARQptGRSWGQPGYPWPlYANEQl*gWAGWU;*SPRGSRPnWGPlSIDPRRkSRKLGKV 



202 
198 
199 
200 
203 
201 
204 



198-205 



SEQ ID NO: ISOLATE



SA3 
SA4 

SA5 
SA7 
SA13 
SAl 
SA6 



consensus 



123 IDTLTCGFADIJ«mPIArtWPVGGVARAIJWGVRAl*EDGVirc^^ 

123 IDTLTCGFADIlCYlPlAroGPVGGVAlUVIJWGVRVLKDOVNYATGIi^ 

123 IDTLTCGFADIJfflYIPLVGGPVQGVAKAIJUIGVllVLSDGV*nr^^ 

123 IDTLTOTFADLICTIPLVOGPVGGVARAIJUIGVRVLBDOWYATGOT^ 

123 IDTLTCGFADI^aGYIPLVGGPVGGVARALAHGVRVICTOVIsrfATGOTiPGCSFSIFIl^^ 

123 IDTLTCGFADI2!GYIPLVGGPVGGVARAIJ«GVR\7IOTG\«n'ATGNLPGCSFSIFIIAI^ 

123 IDTLTCGFADLMGYIPLVGGPVGGVARALAHGVRVI£DGVNYATGm*PGCSFSIFIIJa^ 

123 IDTLTCGFADLMGYI PLVGGPVGGVARAliAHGVRVI*BDGVKYATGMLPGCS FS I PvLALJ*S 

IDTI.TCGFADIM5YI PLVGGPVGGVARALAHGVRvLEDGVNYATGNLPGCSFS I Fi LALLS 



205 
202 
198 
199 
200 
203 
201 
204 

198-205 



SA3 
SA4 
SA5 
SA7 
SA13 
SAl 
SA6 

consensus 



184 CLTVPAtA 
184 CLTVPASA 
184 CLTVPASA 
184 CLTVPASA
184 CLTVPASA 
184 CLTVPtSA 
184 CLiiPASA 
184 CLtvPASA 

CLtvPfltsA 



^Express Mail No. EG2 969 1 0385US 



WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)



(51) International Patent Classification 6:

C12N 15/51, C07K 14/18, G01N 33/53,
A61K 39/29, C12Q 1/68, 1/70, C07K 16/10



A3 



(11) International Publication Number: WO 96/05315 

(43) International Publication Date: 22 February 1996 (22.02.96) 



(21) International Application Number: PCT/US95/10398

(22) International Filing Date: 15 August 1995 (15.08.95)

(30) Priority Data:

08/290,665    15 August 1994 (15.08.94)    US



(71) Applicant: THE GOVERNMENT OF THE UNITED STATES
OF AMERICA, represented by THE SECRETARY, DEPARTMENT
OF HEALTH AND HUMAN SERVICES, Office of Technology
Transfer, National Institutes of Health [US/US]; Suite 325,
6011 Executive Boulevard, Rockville, MD 20852 (US).

(72) Inventors: BUKH, Jens; 5805 Sonoma Road, Bethesda, MD
20817 (US). MILLER, Roger, H.; 15504 White Willow
Lane, Rockville, MD 20853 (US). PURCELL, Robert, H.;
17517 White Grounds Road, Boyds, MD 20841 (US).

(74) Agent: FEILER, William, S.; Morgan & Finnegan, L.L.P., 345
Park Avenue, New York, NY 10154 (US).



(81) Designated States: AM, AT, AU, BB, BG, BR, BY, CA, CH,
CN, CZ, DE, DK, EE, ES, FI, GB, GE, HU, IS, JP, KE,
KG, KP, KR, KZ, LK, LR, LT, LU, LV, MD, MG, MN,
MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK,
TJ, TM, TT, UA, UG, UZ, VN, European patent (AT, BE,
CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML,
MR, NE, SN, TD, TG), ARIPO patent (KE, MW, SD, SZ,
UG).



Published

With international search report.

Before the expiration of the time limit for amending the
claims and to be republished in the event of the receipt of
amendments.

(88) Date of publication of the international search report:

4 April 1996 (04.04.1996)



(54) Title: NUCLEOTIDE AND AMINO ACID SEQUENCES OF THE ENVELOPE 1 AND CORE GENES OF HEPATITIS C VIRUS
(57) Abstract

The nucleotide and deduced amino acid sequences of cDNAs encoding the envelope (1) genes and core genes of isolates of hepatitis
C virus (HCV) are disclosed. The invention relates to the oligonucleotides, peptides and recombinant envelope (1) and core proteins derived
from these sequences and their use in diagnostic methods and vaccines.



FOR THE PURPOSES OF INFORMATION ONLY 



Codes used to identify States party to the PCT on the front pages of pamphlets publishing international
applications under the PCT.

AT  Austria
AU  Australia
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CS  Czechoslovakia
CZ  Czech Republic
DE  Germany
DK  Denmark
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgystan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakhstan
LI  Liechtenstein
LK  Sri Lanka
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SI  Slovenia
SK  Slovakia
SN  Senegal
TD  Chad
TG  Togo
TJ  Tajikistan
TT  Trinidad and Tobago
UA  Ukraine
US  United States of America
UZ  Uzbekistan
VN  Viet Nam











INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 95/10398



A. CLASSIFICATION OF SUBJECT MATTER

IPC 6 C12N15/51 C07K14/18 G01N33/53 A61K39/29 C12Q1/68
C12Q1/70 C07K16/10

According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols)

IPC 6 C12N C07K G01N A61K C12Q

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category * | Citation of document, with indication, where appropriate, of the relevant passages | Relevant to claim No.

PROC. NATL. ACAD. SCI. USA,
vol. 89, no. 11, June 1992
pages 4942-4946,
JENS BUKH ET AL. 'Sequence analysis of the 5' noncoding region of hepatitis C virus.'
see the whole document
Relevant to claim No.: 1,2,21,31,32,38; 5-20,22-30,33-37,39-51,53,54,56-59



Further documents are listed in the continuation of box C.

Patent family members are listed in annex.

* Special categories of cited documents:

'A' document defining the general state of the art which is not
considered to be of particular relevance

'E' earlier document but published on or after the international
filing date

'L' document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

'O' document referring to an oral disclosure, use, exhibition or
other means

'P' document published prior to the international filing date but
later than the priority date claimed

'T' later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

'X' document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

'Y' document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

'&' document member of the same patent family



Date of the actual completion of the international search

20 February 1996

Date of mailing of the international search report

01.03.96

Name and mailing address of the ISA

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer

Hix, R

Form PCT/ISA/210 (second sheet) (July 1992)

page 1 of 5






PROC NATL ACAD SCI U S A 90 (17), 1993,
8234-8238. CODEN: PNASA6 ISSN: 0027-8424,
September 1993
BUKH J ET AL 'AT LEAST 12 GENOTYPES OF
HEPATITIS C VIRUS PREDICTED BY SEQUENCE
ANALYSIS OF THE PUTATIVE E1 GENE OF
ISOLATES COLLECTED WORLDWIDE.'
see the whole document
Relevant to claim No.: 1,2,5,21-23,26-32,38; 5-20,24,25,33-37,39-51,53,54,56-59

JOURNAL OF GENERAL VIROLOGY 75 (5), 1994,
1053-1061. ISSN: 0022-1317,
May 1994
SIMMONDS P ET AL 'Identification of
genotypes of hepatitis C virus by sequence
comparisons in the core, E1 and NS-5
regions.'
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,29-51,53,54,56-59

JOURNAL OF BIOMEDICAL SCIENCE 1 (3), 1994,
158-162. ISSN: 1021-7770,
June 1994
KAO J-H ET AL 'Detection of divergent
hepatitis C virus envelope sequences.'
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,29-51,53,54,56-59

BIOCHEM BIOPHYS RES COMMUN 192 (2), 1993,
635-641. CODEN: BBRCA9 ISSN: 0006-291X,
30 April 1993
STUYVER L ET AL 'ANALYSIS OF THE PUTATIVE
E1 ENVELOPE AND NS4A EPITOPE REGIONS OF
HCV TYPE 3.'
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,29-51,53,54,56-59














Category: X, Y
ARCHIVES OF VIROLOGY SUPPLEMENTUM 0 (7),
1993, 27-39. ISSN: 0939-1983,
ROGGENDORF M ET AL 'Variability of the
envelope regions of HCV in European
isolates and its significance for
diagnostic tools.'
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,29-51,53,54,56-59

Category: X, Y
PROC. NATL. ACAD. SCI. U. S. A. (1992),
89(15), 7144-8 CODEN: PNASA6; ISSN:
0027-8424,
1 August 1992
CHA, T. A. ET AL 'At least five related,
but distinct, hepatitis C viral genotypes
exist'
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,29-51,53,54,56-59

Category: X, Y
BIOCHEM. BIOPHYS. RES. COMMUN. (1994),
199(3), 1474-81 CODEN: BBRCA9; ISSN:
0006-291X,
30 March 1994
LI, JI-SU ET AL 'Identification of the
third major genotype of hepatitis C virus
in France'
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,29-51,53,54,56-59

Category: X, Y
WO,A,94 01778 (CHIRON CORP) 20 January
1994
see the whole document
Relevant to claim No.: 53,54,56-58; 1,2,5-51,59

Category: X, Y
WO,A,92 19743 (CHIRON CORP) 12 November
1992
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,53,54,56-59








Category: Y
WO,A,92 21759 (PASTEUR INSTITUT) 10
December 1992
see the whole document
Relevant to claim No.: 1,2,54,56-59

Category: X
EP,A,0 586 065 (TOMEN CORP) 9 March 1994
see the whole document
Relevant to claim No.: 53,54,56-58

Category: X
PROC. NATL. ACAD. SCI. USA,
vol. 89, January 1992
pages 187-191,
J. BUKH ET AL 'Importance of primer
selection for the detection of hepatitis C
virus RNA with the polymerase chain
reaction assay'
see the whole document
Relevant to claim No.: 21,28

Category: P,X
WO,A,95 01442 (US HEALTH) 12 January 1995
see the whole document
Relevant to claim No.: 1,2,5-51,53,54,56-59

Category: P,X; Y
WO,A,94 25601 (INNOGENETICS NV; MAERTENS
GEERT (BE); STUYVER LIEVEN (BE)) 10
November 1994
see the whole document
Relevant to claim No.: 21,28; 1,2,5-20,22-27,29-51,53,54,56-59

Category: P,X; Y
WO,A,94 27153 (CHIRON CORP) 24 November
1994
see the whole document
Relevant to claim No.: 21,28,53-58; 1,2,5-20,22-27,29-51,59

Category: P,X
PROC. NATL. ACAD. SCI. U.S.A. (1994),
91(21), 10134-8 CODEN: PNASA6; ISSN:
0027-8424,
11 October 1994
STUYVER, LIEVEN ET AL 'Classification of
hepatitis C viruses based on phylogenetic
analysis of the envelope 1 and
nonstructural 5B regions and
identification of five additional
subtypes'
see the whole document
Relevant to claim No.: 1,2,5-51,53,54,56-59

















Category: P,X
JOURNAL OF CLINICAL MICROBIOLOGY 32 (9),
1994, 2280-2284. ISSN: 0095-1137,
September 1994
RAVAGGI A ET AL 'Distribution of viral
genotypes in Italy determined by hepatitis
C virus typing by DNA immunoassay.'
see the whole document
Relevant to claim No.: 1,2,5-51,53,54,56-59

Category: P,X
SEMINARS IN LIVER DISEASE,
vol. 15, no. 1, February 1995
pages 41-63,
J. BUKH ET AL. 'Genetic heterogeneity of
hepatitis C virus: quasispecies and
genotypes'
see the whole document
Relevant to claim No.: 1,2,5-51,53,54,56-59

Category: P,X
BIOCHEMICAL AND BIOPHYSICAL RESEARCH
COMMUNICATIONS,
vol. 202, no. 3, 15 August 1994
pages 1308-1314,
L. STUYVER 'Cloning and phylogenetic
analysis of the core, E2 and NS3/NS4
regions of the hepatitis C virus type 5a'
see the whole document
Relevant to claim No.: 1,2,5-51,53,54,56-59









Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This international search report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. [X] Claims Nos.: 18,45,49

because they relate to subject matter not required to be searched by this Authority, namely:

Remark: Although these claims are directed to a method of treatment of
(diagnostic method practised on) the human/animal body, the search
has been carried out and based on the alleged effects of the
compound/composition.

2. [ ] Claims Nos.:

because they relate to parts of the international application that do not comply with the prescribed requirements to such
an extent that no meaningful international search can be carried out, specifically:

3. [ ] Claims Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

- 26 subjects

See continuation-sheets PCT/ISA/210

1. [ ] As all required additional search fees were timely paid by the applicant, this international search report covers all
searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search report
covers only those claims for which fees were paid, specifically claims Nos.:

4. No required additional search fees were timely paid by the applicant. Consequently, this international search report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

1,2,5-51,53,54,56-59 (partially)

Remark on Protest [ ] The additional search fees were accompanied by the applicant's protest.

[ ] No protest accompanied the payment of additional search fees.
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FURTHER INFORMATION C NTINUEO FR M PCT/ISA/210 



claims : 

1. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 1-8 and 52-59 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype I/1a.

2. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 103-108 and 
155-160 used in the recombinant protein expression, detection 
of antibodies against HCV, vaccines and methods of detection 
using PCR primers and Identification of Genotype I/1a.

3. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 9-25 and 60-76 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype II/1b.

4. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 109-124 and 161-176 
used in the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype II/1b. 

5. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 26-29 and 77-80 used 
in the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype III/2a. 

6. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 125-128 and 177-180 
used in the recombinant protein expression, detection of 
antibodies against HCV, vaccines and methods of detection using PCR 
primers and Identification of Genotype III/2a. 

7. 1,2,5-51,53,54,56 to 59 (partially): 

[...] vaccines and methods of detection using PCR primers and 
Identification of Genotype IV/2b. 
8. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 129-133 and 181-185 
used in the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype IV/2b. 

9. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 34 and 85 used in the 
recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype IV/2c. 

10. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 134 and 186 used in 
the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype IV/2c. 

11. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 35-39 and 86-90 used 
in the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype V/3a. 

12. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 135-138 and 187-190 
used in the recombinant protein expression, detection of 
antibodies against HCV, vaccines and methods of detection using PCR 
primers and Identification of Genotype V/3a. 



13. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 40 and 91 used in the 
recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype 4a. 

14. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 139 and 191 used in 
the recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype 4a. 
15. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 41 and 92 used in the 
recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype 4b. 

16. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 141 and 193 used in 
the recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype 4b. 

17. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 42-43 and 93-94 used 
in the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype 4c. 

18. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 143-144 and 195-196 
used in the recombinant protein expression, detection of 
antibodies against HCV, vaccines and methods of detection using PCR 
primers and Identification of Genotype 4c. 

19. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 44 and 95 used in the 
recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype 4d. 

20. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 145 and 197 used in 
the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype 4d. 

21. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 142 and 194 used in the 
recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype 4e. 
22. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 140 and 192 used in 
the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype 4f. 

23. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 45-50 and 96-101 used 
in the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype 5a. 

24. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 146-153 and 198-205 
used in the recombinant protein expression, detection of 
antibodies against HCV, vaccines and methods of detection using PCR 
primers and Identification of Genotype 5a. 



25. 1,2,5-51,53,54,56 to 59 (partially): 

Genotype specific peptides from E1 Seq. ID 51 and 102 used in the 
recombinant protein expression, detection of antibodies against 
HCV, vaccines and methods of detection using PCR primers and 
Identification of Genotype 6a. 

26. 3-52,55 and 59 (partially): 

Genotype specific peptides from Core Seq. ID 154 and 206 used in 
the recombinant protein expression, detection of antibodies 
against HCV, vaccines and methods of detection using PCR primers 
and Identification of Genotype 6a. 